AN ANALYSIS OF THE ELEMENTARY
PSYCHOLOGY COURSE AT THE 
OHIO STATE UNIVERSITY 


F. C. DOCKERAY AND W. L. VALENTINE¹
Ohio State University 


Problems in the improvement of instruction necessitate
a clear statement of the objectives of teaching so that
there can be no misunderstanding of the direction in
which it is hoped that improvement will take place. Broad
generalizations, such as “character building” or “personality
development,” as instructional aims are valueless because they
are complexes of no one knows how many variables. The com- 
ponent parts of any broad objective are related to each other 
in subtle ways that defy analysis into causal relations as dis- 
tinct from adventitious combinations. Valuable objectives are 
those which are stated in such a way that they are susceptible 
to measurement. In addition they must fit into the general 
aims of higher education. 

Too frequently general service courses in academic subjects 
have been limited in their content and method by the personal 
interest of the instructor in charge. Sometimes they are ex- 
clusively held to be preparatory to further work in the same 
subject. As a result, they serve no wide function in the pro- 
gram of either the institution or department where they are 
offered. They are frequently very worthwhile for a limited
few relatively mature students, but they fail to meet the needs
of a large majority who register in them to satisfy college re-
quirements. Others, in an attempt to meet the requirements of
a variety of students, have been trivial and sophomoric. They
contribute nothing that the student does not already know
when he enters the course.


1 The authors feel that they have functioned very largely in the capac-
ity of editors rather than authors of the paper. A general service course
cannot and should not be the exclusive property of one or two men as an
advanced course always is. They acknowledge the debt they owe to the
following people who helped to crystallize these notions: Dean George F.
Arps, Professors H. B. English, S. L. Pressey, H. A. Toops, Mervin A.
Durea, Emily Stogdill and a corps of instructors. The authors do hold
themselves responsible for the present form of the paper.

A general service course should be carefully designed to co- 
ordinate with the objectives of the university as a whole. It 
is true that at present there is no agreement regarding these 
objectives, excepting in a broad general way. If we assume, 
however, that fundamentally a university is preparing its 
students to take their places in an adult, industrialized, coop- 
erative society by teaching the skills, knowledge, and attitudes 
that make for creative citizenship and superior social and voca- 
tional attainments, we can relate our objectives to this assump- 
tion. 

It is daily becoming more apparent that an industrial society 
means more leisure time for a large proportion of a population 
and in view of this fact, although avocational objectives are 
implicit in our previous assumption, it is well to make explicit 
this additional requirement of the university’s responsibility. 

Of late there has been a growing feeling among social phi- 
losophers that universities should be doing more in the way of 
developing a sense of social responsibility—creating the habit 
of putting one’s self in the other one’s place. By virtue of its 
subject matter, psychology is in a particularly strategic posi- 
tion to lay the ground for satisfying this objective, but the 
typical academic course as now organized contributes little in 
this connection. 

The formulation of objectives is simply an intellectual exer- 
cise if they are agreed upon and immediately forgotten. On 
the other hand, if the objectives are too inflexible, they result 
in stereotyped subject matter. We look upon our list as tenta- 
tive and vary the emphasis on the individual items according 
to the requirements of an evolving society. Our list has been 
substantial enough to discourage an enthusiastic addition to it 
of objectives of trivial importance and transitory interest. 

The problem of formulating a list of objectives is so inti-
mately related to that of measuring how well the objectives
have been attained that the two can hardly be discussed sepa- 
rately. The objectives determine the type of test to be used 
and the results of testing always demand a reformulation of 
objectives. Thus, what was originally thought to be a single 
objective may turn out to be two or more entirely unrelated 
aims, and conversely. This testing technique prevents mere 
lip service to the objectives on the part of the individual in- 
structor. It is at this point that most previous lists of aims 
have broken down. Some items are simply so broad that they 
can never be tested; for others, existing techniques of measure-
ment are impotent. Both of these problems offer a challenge
to a progressive educational technician, but probably have no 
place in a course where the fundamental aim is to do a few 
things well. 

The problem of method is also intimately related to the ob- 
jectives and the testing program. Among others, the problem 
of spacing the tests falls in this category. If a skill, or a 
knowledge, or an attitude is fundamental to the later under- 
standing of the principles developed in a course, there is no 
reason why it should not be tested in the early part of the 
course. An arbitrary standard of mastery could be set (by 
experiment) and further testing specifically for this item could 
be waived as soon as the criterion of mastery is met. It is fool- 
ish to suppose that the mid-term or end-term test or examina- 
tion is any more important than any other one given during 
a semester. It is likewise bad method to wait until the end 
of a term to measure the acquisition of an important skill or 
the crystallization of a significant attitude because it is by that 
time too late to do anything about the results until the next 
semester and with a different group of students. The experi- 
mentally demonstrated instructional value of taking an exami- 
nation is thus entirely lost. 
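
The waiver rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper says the standard of mastery would be set by experiment, so the value 0.80 and the names `MASTERY_CRITERION` and `items_still_to_test` are hypothetical, as is the invented student record.

```python
# Hypothetical mastery criterion; the paper leaves the actual standard to be
# set by experiment, so 0.80 is only an assumed placeholder.
MASTERY_CRITERION = 0.80


def items_still_to_test(scores_by_objective, criterion=MASTERY_CRITERION):
    """Return the objectives for which the criterion has not yet been met.

    scores_by_objective maps an objective name to the list of proportion-
    correct scores a student has earned on successive early quizzes. An
    objective is waived from further testing once any score reaches the
    criterion.
    """
    remaining = []
    for objective, scores in scores_by_objective.items():
        mastered = any(score >= criterion for score in scores)
        if not mastered:
            remaining.append(objective)
    return sorted(remaining)


# Invented record for one student after three early quizzes.
student = {
    "technical vocabulary": [0.55, 0.70, 0.85],
    "observation vs. inference": [0.40, 0.60],
    "reading graphs": [],
}
print(items_still_to_test(student))
# ['observation vs. inference', 'reading graphs']
```

Run early in the term, a check of this kind leaves time to act on the result with the same group of students.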


A LIST OF OBJECTIVES FOR ELEMENTARY PSYCHOLOGY 


The five objectives isolated here have no correspondence 
with the temporal sequence of the two beginning courses offered 
at Ohio State. We do not acquire all of the facts and then
apply them. We do not attempt to eliminate superstition one 
day and teach scientific method the next. If there is any 
tendency to generate fractionated methods of this type as a 
result of isolating objectives, then they had better remain unde- 
fined. We have observed no tendency in this direction in our 
own work. But if this analysis makes it appear to be true in 
spite of a denial, then the presentation is inadequate. 


1. The acquisition of (a) facts and (b) principles of human
behavior.

2. The practical application of psychological principles to
the problems and contacts of daily life.

3. The acquisition of a technical vocabulary.

4. The acquisition of a skill in the application of scientific
method to problems in human behavior.

5. The elimination of wide-spread superstitions and miscon-
ceptions regarding human behavior.


THE ACQUISITION OF FACTS AND PRINCIPLES 


The acquisition of facts about human behavior is mentioned 
explicitly and first in our list for two reasons. First, contrary 
to the belief of some enthusiastic educators, we hold that a 
reasonable quantity of well-demonstrated fact is an essential 
part of the equipment of any well educated citizen. The selec-
tion of the facts and the degree of detail required is a philo-
sophic rather than a scientific problem at present. Secondly,
objective techniques for the measurement of acquired facts are
easily constructed. As a matter of fact, they are too easily
constructed, for they have helped to maintain the mistaken
view that a mastery of fact alone ensures the proper and ade-
quate use of these facts in constructive thinking.

Although as far as we are aware, no educator has actually 
said so, some apparently believe that the human organism is 
somehow limited in its capacity much like a bushel basket, and 
that if a large proportion of this volume is taken up with facts, 
then there is no room left for more desirable acquisitions. A 
view of this type is absurd. What is really limited in aca-
demic courses is the time one can devote to the cultivation of
desirable accomplishments. Since mastery of facts does not
in and of itself ensure their functional application, it is abso- 
lutely necessary to devote some time to training in the use of 
the information acquired. In the past this training has been 
either incidental or lacking altogether. Training in the every- 
day use of information acquired simply becomes another objec- 
tive of a course along with, not to the exclusion of, the acquisi- 
tion of facts. Measured by the ordinary true-false test in 
which factual material is stressed, it is obvious that under these 
conditions even a good student might score poorly because he 
would have fewer facts at his command, but those that he did 
have and those that are furnished him would have a functional 
value that is truly significant in productive scholarship. The 
optimum distribution of the time spent on the acquisition of 
facts and on training in their application remains to be dis-
covered together with the extent to which the habit of the func-
tional application of acquired information can be generalized.
The particularly vicious aspect of so-called standardized tests 
for psychology which have been published lies in their use in 
determining educational methods on the basis of the results. 
They should be recognized for what they are, tests of very ele- 
mentary facts about anatomy, particularly the anatomy of the 
nervous system. A given procedure might very well be best 
for attaining this objective, but to generalize that methods so 
selected are best for the cultivation of other more significant 
objectives is an unjustifiable inference. 

Facts and principles are not mutually exclusive categories. 
They rather comprise a hierarchy of propositions from the ex- 
tremely restricted and detailed at one extreme to the highly 
generalized at the other. There is no logical reason why the 
psychological process of recall should be any different for facts 
and principles. 

Items cannot be judged either facts or principles by inspec- 
tion. Let us take as an example the items found in a para- 
graph from the text-book used in the course: 


1. Psychology is a branch of science.

2. Psychology attempts to formulate the laws of human
behavior.

3. Psychologists are particularly interested in how man gets
along in a social world.

4. Man is above all else a social creature.

5. Social development has resulted in many individual
problems where the best interests of the individual and
society are not identical.


From one point of view we have five relatively unrelated 
facts. If they appeared as items scattered through a quiz, 
they would certainly be so classified. But from another point 
of view they are broad generalizations. That “psychology is
a branch of science” or that “psychology attempts to formu-
late the laws of human behavior” is a generalization based on
literally thousands of observations on what psychologists are 
doing and how they are doing it. It is not likely that a stu- 
dent will make these observations or that he will consider any 
of the five items any more than a simple fact. This is likewise 
true of many other items that have been considered broad gen- 
eralizations, a fact which will be apparent to any one who has 
asked students “Now just why did you mark that item the way
you did?” One student replies: “I remember that that is
what the book says.” Another indicates a rather comprehen-
sive inferential process. An item that for one student involves
simple recall will generate in another a rational process of inde- 
pendently arriving at a conclusion. Herein lies the real basis 
for the distinction between fact and principle. 


2 A complete catalog of the facts presented to the students in this
course would be simply a sentence outline of the text-book similar to the
example given above. 

The problem of the instructors is to select from these thousands of 
details those which are necessary for the development of the generaliza- 
tions presented later in this paper and are important enough to warrant 
the expectation that the student will remember them not only for the pur- 
poses of the course, but for an indefinite time after the course is finished. 
Their proper and satisfactory choice constitutes the art rather than the 
science of teaching. 

“Text-book” will be used throughout this discussion although actually
assigned readings, moving pictures, laboratory experiments, demonstra- 
tions, and discussion notes all make up the available material upon which 
tests of mastery are based. 

One of the reasons frequently given for the retention of the
unreliable essay type of examination is that the questions are 
more likely to release reasoning responses than objective 
quizzes. That students now report that they study differently 
for the two types of examinations substantiates this point.
But that they say that they give more attention to burdensome 
detail in studying for the objective quiz is no indictment of 
objective quizzes generally, excepting that too many objective 
quizzes are now poorly constructed. The student must under- 
stand the objectives of the course so that he can most efficiently 
expend his effort. If he is told that the quizzes will contain 
no verbatim statements and that the paraphrased statements 
will involve an understanding of the principles, this difficulty 
can be obviated.³

We have made no attempt to isolate the facts studied in the 
elementary courses beyond the construction of vocabulary 
tests. The elementary staff has attempted to construct a mas-
ter list of important principles for its own guidance in instruc-
tion. A typical principle would be:


Generalizations based on a few specific cases do not always 
apply to all members of a group. 


Or another, 


If an event occurs under condition A, A may not be 
uniquely related to the event (principle of control). 
Conditions B, C, D, etc., must also be examined.


The specific illustrations (facts) which the instructor uses to
establish these principles are immaterial. They may vary
widely from instructor to instructor or from time to time for 
the same instructor. Within limits the temporal sequence may 
vary. The important point is that these two generalizations 
have been treated some time in connection with whatever sub- 
ject matter appeared to be favorable for the establishment of 
these principles. 


3 The type of examination that we give permits of the open-book 
method which we have used in some sections with satisfactory results. 

The complete list at present numbers some sixty generaliza-
tions of this type.⁴ It has served as a convenient and highly
useful check list for the instructor and a source for items in- 
tended to measure an understanding of the course as opposed 
to a specific memory for the subject matter. 

Again the question of method forces itself into our presen- 
tation. If one simply takes this list to class and in effect dic- 
tates it to the students, the objectives will not be met, particu- 
larly if he only tests for a knowledge of the principles. The 
best way to generate principles of this kind is to permit the 
student to isolate for himself the essential elements from the 
complex matrix in which they occur. The task of the instruc-
tor is to arrange the material in such a way that insight and
meaning may occur. It has been demonstrated that the in-
structor can give rather pointed “hints” regarding the direc-
tion in which the ultimate solution will lie, and the student
may employ these hints and still not know that he has been 
helped in arriving at a solution of his problem. Principles 
isolated in this way have a functional value (which may be 
tested in their application to other problems which the student 
has not dealt with before) that dictated principles do not have. 
As soon as he understands that the method by which a solution 
is attained is as important as the conclusion and that he will 
be graded on his understanding of rather than his knowledge 
of principles, he will cease to be mainly interested in the out-
come of a demonstration or the answer “in the book.”


THE PRACTICAL APPLICATION OF PSYCHOLOGICAL PRINCIPLES 
TO THE PROBLEMS AND CONTACTS OF DAILY LIFE 


Instructors in psychology for years have been making allu-
sions to the practical problems of daily life in connection with
the application of psychology. Some have apparently been
more successful than others if the number of “cases” that they
treat is any criterion. There has not been, however, any sys-
tematic attempt to measure how well the student group as a
whole can apply the principles in which they have had instruc-
tion. There is even some doubt that the rigid application of
the “number of cases criterion” of success in this respect is
actually significant. It may simply mean that the instructor
has encouraged a feeling of dependency, an inability to face
one’s problems without help. But be that as it may, the tacit
assumption here as in all other subjects has been that informa-
tion about a subject is highly correlated with a skill in apply-
ing it. This assumption is not justified on the basis of
measurement.


4 The complete mimeographed list will be sent to any one who is inter-
ested.

As indicated above, the traditional true-false and multiple- 
choice techniques do not lend themselves to the objective 
measurement of an ability to apply the principles isolated in 
the course. In only a few cases can one write items which in-
volve one and only one “correct” application of a principle.
Whereas an item of simple fact is either right or wrong, yes-no
answers have to be qualified in these more generalized cases.

We have therefore adopted a technique recently suggested 
by Smeltzer⁵ for the objective measurement of applied infor-
mation and used by other investigators as well. The practical
problem is presented and five solutions are given. These are 
selected so that there is a best and a poorest example of the 
application of some principle with others lying between. The 
exact weights vary from item to item. The following direc-
tions are typical for this kind of test:


This is a test of your ability to apply the principles that you 
have learned in this course to practical situations. After each 
problem there are a number of solutions. Choose the one most 
in agreement with the principles you have learned here and 
give it a weight of 5. Select the poorest solution and give it 
a weight of 1. Be certain that you have one 5 and one 1. The
other numbers (4, 3, 2) are used for the other solutions. Any
of these three may be used more than once. Write the num-
bers in the space provided:


5 best solution
4 good (or next best) solution
3 mediocre solution
2 poor solution
1 poorest solution


5 Smeltzer, C. H., “Objective Measurement of Applied Information,”
Jour. Appl. Psychol., 1933, 17, 765-771.


A key is then prepared based on the independent judgments 
of the instructors. Items upon which there is a wide diver- 
gence of opinion are modified or eliminated. In case of a dif- 
ference of opinion of one point, the majority judgment is 
taken.⁶
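
The preparation of the key can be sketched as follows. This is a minimal illustration, not the authors' documented procedure: the paper does not say how wide a divergence must be before an item is modified or eliminated, nor how ties are settled, so the cut-off `MAX_SPREAD`, the tie rule, and the function name `build_key` are assumptions.

```python
from collections import Counter

# Assumed cut-off: judgments more than one point apart count as a "wide
# divergence of opinion". The paper does not give a figure.
MAX_SPREAD = 1


def build_key(judgments):
    """Return the keyed weight for one solution, or None if judges diverge.

    judgments is a list of integer weights (1 to 5), one per instructor,
    each assigned independently. When all judgments lie within one point of
    each other, the majority weight is taken.
    """
    if max(judgments) - min(judgments) > MAX_SPREAD:
        return None  # item should be modified or eliminated
    counts = Counter(judgments)
    top = max(counts.values())
    # Assumed tie rule (not in the paper): take the lower of the tied weights.
    return min(weight for weight, n in counts.items() if n == top)


print(build_key([5, 5, 4]))  # 5
print(build_key([5, 3, 4]))  # None
```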

The students’ papers are scored in terms of deviation from 
the key. No answer is wrong; it simply does not agree with 
the judgment of a group of competent instructors. The maxi- 
mum deviation is 4, the least 0, so that small scores indicate 
high grades. 
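
As a worked illustration of scoring by deviation from the key, the sketch below assumes that the deviations for the five solutions are taken as absolute differences and summed; the paper states only that each deviation ranges from 0 to 4 and that small scores indicate high grades. The function names and the sample weights are invented for the example.

```python
def is_valid_response(weights):
    """Check the test directions: weights 1 to 5, exactly one 5 and one 1."""
    return (
        all(isinstance(w, int) and 1 <= w <= 5 for w in weights)
        and weights.count(5) == 1
        and weights.count(1) == 1
    )


def deviation_score(response, key):
    """Sum the absolute differences between a student's weights and the key.

    Each solution deviates by at most 4 (a 1 set against a 5) and at least 0,
    so a smaller total means closer agreement with the instructors' judgment.
    """
    if len(response) != len(key):
        raise ValueError("response and key must cover the same solutions")
    return sum(abs(r - k) for r, k in zip(response, key))


key = [5, 3, 1, 4, 2]      # invented key for one five-solution problem
student = [4, 3, 1, 5, 2]  # invented student weights
print(is_valid_response(student))     # True
print(deviation_score(student, key))  # 2
```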

In educational psychology classes Smeltzer found that scores 
on tests scored in this way correlated to the extent of .63 with 
conventional multiple-choice items of the informational type 
and about .27 with intelligence. Our own results substantiate 
these findings indicating a truly specific skill which is not iden- 
tical with either intelligence or acquired information, although 
more intimately related to information than to intelligence. 
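
Coefficients such as the .63 and .27 quoted above can be computed from paired scores. The sketch below assumes the ordinary product-moment (Pearson) coefficient, which the paper does not name explicitly; the function name `pearson_r` is illustrative, and no data from the study are reproduced.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists of at least two scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores in each list must vary")
    return cross / math.sqrt(ss_x * ss_y)
```

Since small deviation totals indicate high grades, the totals would presumably be reversed, or the sign of the coefficient disregarded, before comparison with a test on which large scores are good.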

Best results are obtained in testing these functions if situa- 
tions can be obtained which are not treated specifically in the 
text. Otherwise, verbatim memory will be tested rather than 
the ability to apply principles. There is now available infor- 
mation which indicates that these are really two related, but 
not identical, phases of human behavior. 

It is difficult to construct tests of applied information that 
are not better intelligence tests than tests of accomplishment. 
For educational psychology Smeltzer found his test valid in 
the sense that students of wide practical experience in the 
schoolroom made distinctly better scores than those who had 
had no experience. That intelligence alone could not explain 
the results was likewise demonstrated by him. Unfortunately,
validating groups of wide practical experience do not exist for
general psychology, but our experience indicates that this diffi-
culty is not insurmountable.


6 It is frequently overlooked that this procedure is identical with that
of ascertaining whether or not an item is a fact. Students and some
instructors get the notion that a T or an F has some essence aside from
an expression of agreement among competent judges.

We have elaborated the practical application of psychologi- 
cal principles because it is the specific skill that psychology has 
to offer to the developing student. Other sciences have labora- 
tory skills that serve vocational objectives. Public speaking 
offers an important skill the value of which is acknowledged 
by psychology, but which psychologists do not teach.⁷ The
special skill which we do offer is that involved in ordering indi- 
vidual behavior in conformity with psychological principles. 
If the student can see in his own behavior the operation of the 
various mechanisms of evasion and correct for them; if he 
understands the methods by which habits are made and broken;
if he recognizes and avoids acts indicating childish or adoles- 
cent motives; if he can get some glimpse of the reasons for 
seemingly unreasonable acts of those in authority over him; 
if he can do all these things and others, he is making adequate 
use of the psychological principles that he has learned.⁸

7 We have, as all other instructors have, made use of the student oral
report, but nevertheless we are not primarily interested in the technique
of delivery. We have recently made some tentative trials of the panel
discussion as an illustration of a method by which social problems can be
set and discussed by intelligent, informed citizens; but we do not feel
that this is a technique upon which psychology has any special claim.

8 As is the case for every other skill, one finds occasional people who,
without formal training in the discipline, have developed a high degree
of psychological skill, exceeding even professional psychologists in the
respects named above. Reflection will show that psychology is not the
only academic subject of which this is true.


The practical application of psychological principles to
everyday problems and contacts is the objective of what is now
called, for the want of a better name, “mental hygiene.” We
seriously considered at one time the explicit mention of “men-
tal hygiene” in our list of objectives. We made a list of the
principles of mental hygiene which would find application in
a beginning course. They turned out to be repetitions of our
own principles. The particular value of mental hygiene lies
in its emphasis upon the practical application of these princi-
ples. Techniques have been listed by mental hygienists which
are valuable in this connection and have exercised an impor-
tant influence in our methods, but not in our objectives. We
do give more time to the accomplishment of the objective under
consideration than is usually true in academic courses in psy-
chology. The treatment of specific cases by mental hygienists
involves giving a highly personalized first course in psychology
adapted to the particular needs of the case being treated. The
material is adapted to the patient’s needs and understanding.
Theory, questions, reservations, method, systematization: all
are eliminated in favor of quick and radical re-education.
These techniques are not desirable where dire conflict does not
exist and in any event could not be accomplished where the
patient does not actively seek help.


THE ACQUISITION OF A TECHNICAL VOCABULARY 


A technical vocabulary in psychology serves the immediate 
purposes of the course and is a valuable addition to the general 
vocabulary. The development of the general vocabulary by 
the addition of technical words is of more significance in psy- 
chology (in common with the other social sciences) than would 
be true in some other subjects because psychological words are 
more frequently used in social intercourse. There is a host of 
technical words that have no application in this connection. 
This is particularly true in connection with the work of some 
few authors. The emphasis on vocabulary arose in connection 
with the observation that more than half of what some instruc- 
tors say must be unintelligible to the majority of students. 
Several studies have shown that there is an intimate relation- 
ship between performance on quizzes and vocabulary. Inter- 
views with students in difficulty have shown that a consider- 
able proportion of them fail examinations because they do not 
understand the questions. A sympathetic attitude on the part 
of an instructor during the first part of the course with regard 
to a student’s difficulty in this connection together with some 
drill (this need not be done in class) will help to correct this 
deficiency. In some courses it has been found that the new words
met by the student are easily equivalent to the demands of a 
foreign language course where the acquisition of a vocabulary 
is a primary objective at first, while we are asking students to 
master the vocabulary incidental to the mastery of subject 
matter.⁹


THE APPLICATION OF SCIENTIFIC METHOD TO PROBLEMS 
OF HUMAN BEHAVIOR 


In agreement with the tenor of the foregoing discussion, we 
can assume, until demonstrated otherwise, that skill in the ap- 
plication of scientific method to problems of human behavior 
is relatively uncorrelated with a knowledge of scientific prin- 
ciples. We have further evidence that a skill in this respect is 
not simple, but involves several other distinct sub-skills. The 
variety of these has never been determined experimentally, but 
this problem should yield to the recently developed factor 
analysis techniques. It is further apparent that the specific 
skills demanded for the scientific method in one field have little 
or no value in another. Skill in the use of a microscope has 
no value in the social branches of psychology, but would be 
very valuable in biology or in physiological psychology.¹⁰
Skill in locating sources of information may be valuable in his- 
tory but of lesser value to psychology. This section of our 
analysis of a beginning course is not, then, a treatise on scien-
tific method in its general aspects, but is rather a treatment of
the specific elements of scientific method which we are now 
using. 

1. Skill in making observations: Skill in making observa-
tions in another scientific laboratory (chemistry, for example)
does not transfer to human relations. The objective attitude
which one can adopt with reference to a galvanometer has to
be especially cultivated in observing humans. This seems to
be particularly true with reference to human infants where the
prevailing conventional responses are sentimental rather than
scientific. We find that unless especially warned not to do it,
most of the class when viewing a motion picture of an infant
learning to creep are comparing him with an infant brother,
feeling sorry for him, liking him, or disliking him, etc. All of
these responses require the participation of the total reacting
mechanism so that they are just as ignorant of the development
of the creeping response after viewing the film as they were
before seeing it.¹¹ The specific training offered in this connec-
tion is to provide an opportunity to observe salesmen attempt
to sell some commodities in a “class shopping experiment.”
The protocols are graded on the basis of the clear separation
of observations and inferences based on observation. This
ability is also tested in quizzes designed to measure how well
the objective has been accomplished using a variety of subject
matter. The necessity of accurate language habits is stressed
in this connection.¹²


9 S. L. Pressey found that the technical words appearing in a zoology
text made a list about four times as long as that required for a mastery
of first year Latin.

10 Each of these skills must be given its proper place in the objectives
of the course. One can imagine a very good course in botany or zoology
in which the microscope would never be actually used. Still for many
beginning courses the acquisition of this skill seems to be the fundamental
objective. Except for those few students who go into advanced courses,
this training is entirely wasted.


11 The value of having objectives and being able to measure the extent 
to which they are realized seems to be particularly important in connec- 
tion with the use of educational films. Students have been so habituated 
to the use of films for entertainment purposes only that unless special 
precautions are taken, the film degenerates into an entertainment feature 
purely and simply. Instructors also need special direction in this respect. 
We have personally seen motion picture demonstrations which would bet- 
ter have attained the apparent objective if the class was dismissed and 
sent to a commercial movie house where it could be entertained in more 
pleasant surroundings than are available in our classrooms.

12 We had previously thought that the clear distinction between that 
which is observed and that which is inferred was one criterion distinguish- 
ing the literary method from the scientific. A search of literary produc- 
tions for items to use in testing for the attainment of this objective dis- 
closed that the criterion merely served to distinguish between the ‘‘ good’’ 
and the ‘‘ poor’’ in literature. Good literature, like good science, clearly 
labels the inferences that are based on observation while in poor literature 
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2. Getting the negative cases: It is demonstrated that noth- 
ing can be proved unless controls are instituted. The case of 
the ‘‘only child”’ is illustrative. If we find that a proportion 
of only children is neurotic, we still know nothing about the 
relation between neurosis and being an only child. We must 
find out how many children with siblings are neurotic. The 
quizzes measure how well the principle has been learned by 
using material other than that upon which the demonstrations 
were based. 

3. Counting or measuring the results of investigation: For 
years text-books in psychology carried the statement that the 
Babinski sign is the normal reaction to stimulation on the plan- 
tar surface in the newborn. Counting the number of times it 
occurred showed it to have a frequency of about fifty per cent. 
If an event occurs in half the number of cases in which it is 
possible to occur, it is not typical. The reaction time experiment 
is used to develop an appreciation of the necessity for a speci- 
ficity unattainable in non-mathematical language, together with 
skill in handling numbers. The quizzes are 
designed to measure the skill in choosing class intervals, mak- 
ing tallies, computing means and making simple graphs. 

4. Presenting the results: Accurate language habits again 
are emphasized in connection with methods for presenting re- 
sults. The special techniques of scientific presentation—tables 
and graphs—are illustrated and explained. Skill in present- 
ing results serves a general vocational objective in addition to 
the needs of the course since it is difficult to find a profession 
in which it is not necessary to make reports. The quizzes here 
are designed to measure the skill with which tables and graphs 
can be read. 

5. Drawing conclusions from results: The quizzes are de- 
signed to discriminate between those who are and those who 
are not able to discriminate between statements that are based 





and pseudo-science they are confused. An objective thus originally 
chosen to serve a scientific end becomes valuable to students of literature 
as well. 
on the results demonstrated and those which are extrapolations 
or generalizations extended beyond the results. 

6. Drawing inferences: The quizzes are designed to measure 
the reasonableness with which generalizations can be drawn.¹³ 
The pooled judgments of the instructors are the basis upon which 
the quizzes are graded (see previous section). 
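
The skills named in items 2 and 3 above can be sketched in a few lines of Python. This is an illustration only: the counts of neurotic children and the reaction times are invented for the example, and the function names `rate` and `tally` are hypothetical. Nothing here is taken from the course records.

```python
from collections import Counter
from statistics import mean


def rate(cases, total):
    """Proportion of a group that shows the trait."""
    return cases / total


def tally(values, width, start):
    """Count values in class intervals [start + k*width, start + (k+1)*width)."""
    counts = Counter((v - start) // width for v in values)
    return {
        (start + k * width, start + (k + 1) * width): counts[k]
        for k in sorted(counts)
    }


# Item 2 (negative cases): invented counts. The rate among only children
# means nothing until it is set beside the rate among children with siblings.
only_rate = rate(12, 40)       # neurotic only children / all only children
sibling_rate = rate(30, 100)   # neurotic children with siblings / all such children

# Item 3 (counting and measuring): invented reaction times in milliseconds.
reaction_times = [182, 195, 201, 210, 214, 223, 231, 238, 245, 260]
table = tally(reaction_times, width=20, start=180)
average = mean(reaction_times)

if __name__ == "__main__":
    print(f"only children: {only_rate:.2f}   with siblings: {sibling_rate:.2f}")
    for (low, high), count in table.items():
        print(f"{low}-{high - 1}: {'|' * count}")
    print(f"mean reaction time: {average:.1f} ms")
```

With these invented counts the two rates are the same, which is the point of the control: the figure for only children alone would have suggested a relation that the comparison does not support.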


The direction of future research in this field involves the 
development of tests that will measure the degree to which 
these experiences in connection with psychological data have 
become generalized into a habit of scientific thinking that we 
generally call the scientific attitude. There is some evidence 
that certain college courses result in attitudes of broadminded- 
ness not present before the course was taken. Every instruc- 
tor has seen isolated instances of changed attitude as a result 
of his teaching that from his point of view are altogether 
wholesome. It remains, however, to agree upon these specific 
attitudes that one wishes to change and to develop attitude 
scales for their measurement. Tentatively we suggest that 
most progress will be made in the development of those scales 
that measure a critical attitude toward the generally accepted; 
suspended judgment in the case of insufficient or contradictory 
evidence; and openmindedness for the new together with free- 
dom from sentimentality, narrow prejudice and personal bias. 
To what extent these habits can be generalized awaits the 
measurement of the variation in attitude as a consequence of 
varied methods in attaining this objective. From the techni- 
cal standpoint these attitude scales will have to be free from 
the possibility that a student will be able to mark his papers 
with his fingers crossed, only temporarily adopting attitudes 
that conform with the course objectives. The impossibility of 
constructing such scales at the present time has impeded prog- 
ress in this field. 


13 Tyler, R. W., ‘‘Measuring the Ability to Infer,’’ Educ. Research 
Bull., 9, 1930, 475-480. 
CORRECTION OF WIDESPREAD MISCONCEPTIONS AND 
SUPERSTITIONS ABOUT HUMAN BEHAVIOR 


Primitive magic and science have in common the ultimate 
aim of being able to understand and through proper agencies 
to control the natural processes which go on independent of 
man. From very early times magic was also interested in what 
we now call the process of human behavior, but until very 
recently little has been done in applying the proven scientific 
methods to the problem of human life. Even civilized society 
clings to ancient superstition and unproven observation while 
in primitive communities the burden of unfounded fears, 
always the accompaniment of superstition and ignorance, is 
well nigh unbearable. The correction of these superstitions 
is little short of catechetic if the true distinction cannot be 
made on the basis of different objectives as we have already 
indicated. Neither can one say that magical practices are 
always wrong and scientific methods invariably blaze new 
trails in utter darkness. That Peruvian Indians knew for 
centuries that quinine is a specific for malaria or that Arabs 
knew that mosquitoes carried the same disease is of little im- 
portance when we consider the manner of their knowing. 
It is of little surprise that of the thousands of magical prac- 
tices extant some few should agree with scientific findings. 
Isolated examples of this kind cannot justify the use of faulty 
analogy, uncontrolled observation, hasty generalization, and 
erroneous inference in truly scientific studies. 

Without going into the details of our work, suffice it to say 
that we have made measurements on the burden of super- 
stition brought by the student when he enters the course and 
compared them with measurements made again at the end. 
Significant improvements have been demonstrated in this re- 
gard. We have more insight into the regions requiring special 
emphasis and some knowledge of previously wasted effort in 
correcting superstitions that did not exist in our student 
population. 
We are not in a position at present to estimate what the 
fulfillment of the objectives in the beginning courses in psy- 
chology will be as a contribution to the wider program of the 
university in the development of its young men and women. 
That attitudes are changed and improved is a matter of pious 
hope rather than a demonstrated fact though this hope is not 
without some objective evidence to support it. The major 
contribution of these two courses is their training in precise 
and realistic thinking in the solution of social problems. The 
student is better prepared to view with sympathy and under- 
standing those institutions with which he so often finds himself 
in conflict. He has been trained to use only observable, real 
and objective factors in the formulation of judgments and ex- 
planations. Wishful solutions have not been permitted in 
grappling with problems of human nature. All this con- 
tributes to that personality trait which is frequently desig- 
nated as ‘‘emotional maturity,’’ the acquisition specifically 
of those concrete habits which are indicative of a mature ad- 
justment to life and its problems. 














THE IMPORTANCE OF THE MECHANICAL 
FEATURES OF AN ADVERTISEMENT 


LEONARD W. FERGUSON 
Stanford University 


IN a survey of the advertisements appearing in the Turlock 
Daily Journal¹ the writer has found results which differ 
considerably from those based on surveys of advertise- 
ments appearing in metropolitan papers. Contrary to popular 
and scientific belief it was found that there was no relationship 
between the size of an advertisement and its attention value; 
that there were no preferred positions; and that neither the 
right nor the left hand page had any advantage over the other. 
It was further found that position on the page had no effect on 
the attention value of the advertisement; and finally, that the 
day on which the advertisement appeared had more effect on 
its attention value than any other mechanical factor. 

From a selected list of names, chosen so that they would be 
geographically representative of the newspaper’s subscribers, 
the names of 98 subscribers were secured who submitted to the 
interviews required for the survey. These subscribers may be 
classified as in Table I. 

The writer and his two assistants interviewed the 98 sub- 
scribers on Tuesday, Wednesday, Friday, and Saturday of the 
week beginning June 18, 1934. As the Turlock Daily Journal 
is published at 5:00 p.m. the interviewing always began on the 
following day. Thus the interviews for Monday’s paper were 
made on Tuesday, those for Tuesday’s paper were made on 
Wednesday, etc. Altogether the advertisements in Monday’s, 
Tuesday’s, Thursday’s, and Friday’s papers were tested. 

1 This is the only daily paper in Turlock, California. It has a circula- 
tion of 2000 with an estimated adult reader population of 6000. The town 
of Turlock is situated in the southern part of Stanislaus county, approxi- 
mately 100 miles southeast of San Francisco. Within its city limits it has 
about 4000 inhabitants, but if the surrounding territory is included it is 
estimated that it has a buying population of 10,000. 
TABLE I 

DAY OF ISSUE    MEN    WOMEN    TOTAL 
Monday            7      13       20 
Tuesday          10      20       30 
Thursday         16      12       28 
Friday           10      10       20 
Total            43      55       98 

To each of the 98 subscribers was given a pile of advertise- 
ments containing all those which had appeared in the previous 
day’s paper as well as an equal number which had appeared a 
year previous. Each subscriber was requested to look through 
the pile of advertisements and pick out all those which he 
thought he had seen in the particular issue of the paper in 
question. In so doing he was allowed to assign a grade of 100 
per cent, 75 per cent, or 25 per cent validity to the recognition 
of each advertisement selected. 

By an elaborate system of grading the data, pure guessing 
was eliminated and the responses of each person interviewed 
were weighted in accordance with his accuracy of judgment.² 

In Table II are given the average attention values of various 
sized advertisements.³ There does not appear to be, from the 
data presented in Table II, any marked relationship between 
size and attention value. In fact, it would be safer to say that 
there is no relation at all between size and attention value, for 
the figures in Table II must be interpreted rather carefully. 
In the first place the figures in Table II are averages, and al- 
though this is the best statistical estimate that can be made, it 
carries with it no guarantee of accuracy. In the second place 
each figure which appears in Table II is influenced by three 
things and possibly by others. Each figure is based on a dif- 
ferent number of advertisements, each figure is based on the 

2 For a complete description of this method see E. K. Strong, The Effect 
of Length of Series upon Recognition Memory, Psychol. Rev., November, 
1912, 19, 447-462. 

3 The size ratio of an advertisement was computed by multiplying its 
length (in inches) by the number of columns it covered. Thus a 5-inch, 
2-column advertisement would have a size ratio of 10. 
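
The computation described in this footnote can be written as a one-line function. The sketch below is illustrative; the name `size_ratio` and its parameter names are hypothetical and do not come from the survey itself.

```python
def size_ratio(length_inches, columns):
    """Size ratio of an advertisement: length in inches times columns covered."""
    return length_inches * columns


if __name__ == "__main__":
    # The footnote's own example: a 5-inch, 2-column advertisement.
    print(size_ratio(5, 2))
```

A half-inch length gives a fractional ratio, so a 1.5-inch advertisement in a single column has a size ratio of 1.5.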
TABLE II 
Average Attention Values of Advertisements of Different Sizes 

                   ATTENTION VALUES 
SIZE RATIO     Men     Women    Total 
1              .26     .30      .28 
1½             .28     .37      .33 
2              .29     .38      .34 
2½             .29     .38      .32 
3              .35     .29      .30 
3½             .15     .40      .32 
4              .36     .57      .45 
4½             .40     .46      .38 
5              .26     .31      .32 
5½             .00     .25      .16 
6              .38     .68      .49 
10             .19     .52      .35 
16             .45     .62      .54 
19             .29     .64      .44 
20             .19     .42      .30 
27             .18     .26      .24 
30             .26     .49      .36 
34             .66     .45      .54 
36             .10     .42      .32 
48             .36     .52      .43 
55             .26     .33      .30 
56             .32     .82      .54 
60             .36     .79      .54 
80             .94     .68      .68 
160            .32     .76      .52 

responses of a different number of people, and each figure is 
based on advertisements which appeared on different days. 
The other factors which might have had an influence are the 
mechanical features of each advertisement although it is rather 
doubtful that they did have an effect. 

In view of the foregoing and the high probable errors⁴ of 
the percentages shown in Table II, the writer concludes that 
at least for advertisements appearing in the Turlock Daily 
Journal there is no relationship between attention value and 
size. 

4 Because of the considerations set forth above it was felt by the writer 
that it would be misleading to show PE’s. 
The attention values for advertisements which appeared on 
different pages are shown in Table III. The values for men and 


TABLE III 
Average Attention Values of Advertisements on Different Pages 

PAGE                 MEN    WOMEN    TOTAL 
2                    .18    .39      .30 
3                    .26    .38      .32 
4                    .23    .42      .31 
5                    .32    .76      .52 
Opposite Funnies     .31    .53      .41 
Back Page            .41    .?2      .46 





women it will be noted are not in agreement. Thus although 
a certain page may have an advantage over another based on 
the responses of the men, that advantage will not necessarily 
hold when based on the responses of the women. The attention 
values for page 5 are less reliable than any other values in 
Table III, for they are based on only one advertisement. This 
advertisement was run by one of the largest department stores 
in Turlock and was a full page advertisement. Therefore it 
may not be entirely comparable to the advertisements appear- 
ing on other pages. 

The values for the advertisements on the back page seem a 
little higher than those on other pages (with the exception of 
page 5), but they are not significantly different from them in 
all cases. Therefore, it can be concluded that there probably 
are no preferred positions in the Turlock Daily Journal. 

Table IV furnishes evidence which shows that the position 
of an advertisement on the page has no effect on its attention 
value. It will be noted that the upper left quarter of the page 
receives the lowest attention values (except for women) al- 
though not significantly lower. This is probably due to the 
fact that the figures for this position were computed on fewer 
advertisements. It is the policy of the paper to pyramid 
advertisements down from the upper right hand corner of the 
page; therefore, fewer advertisements fall in the upper left 
quarter of the page than in any other position. It is felt by 
the writer that due to this fact the factor of copy was less ade- 
quately ruled out for this position than for any other and since 
the advertisements appearing in this position did have less 
forceful copy than those in other positions, this would account 
for its lower attention values. 


TABLE IV 
Attention Values of Advertisements in Various Positions on the Page 

POSITION            MEN    WOMEN    TOTAL 
Lower left Q.       .32    .42      .35 
Lower right Q.      .30    .43      .37 
Upper left Q.       .19    .41      .23 
Upper right Q.      .27    .36      .33 
Upper half page     .25    .37      .30 
Lower half page     .31    .43      .37 
Right half page     .30    .43      .37 
Left half page      .29    .41      .33 








As can be seen by reference to Table V, neither the right nor 
the left hand page has any advantage over the other in the 
attention values of the advertisements appearing thereon. In 
this case, as with all other mechanical features heretofore dis- 
cussed, the attention value is found to be practically indepen- 
dent: i.e., it is due to something other than mechanical 
features. 

TABLE V 
Attention Values of Advertisements on Right and Left Hand Pages 

                    MEN    WOMEN    TOTAL 
Right hand page     .27    .39      .33 
Left hand page      .23    .44      .33 





The attention values of advertisements appearing Thursday 
and Friday are considerably higher than those of advertise- 
ments appearing Monday and Tuesday. Table VI furnishes 
the information concerning this fact. This can be explained, 
in part at least, by the fact that it is on Thursday and Friday 
that the bargains for Saturday are advertised. Since this has 
long been a policy of the merchants who advertise in the 
Turlock Daily Journal, its subscribers have formed the habit 
of reading the paper more thoroughly on these two days. That 
this is not entirely hypothesis was borne out by the results of 
a reader interest survey carried on in conjunction with this 
advertising survey but reported elsewhere. 


TABLE VI 
Average Attention Values of Advertisements Appearing on 
Different Days 

DAY          MEN    WOMEN    TOTAL 
Monday       .24    .37      .29 
Tuesday      .19    .35      .29 
Thursday     .26    .50      .37 
Friday       .51    .51      .51 

Average      .30    .44      .37 





As a final note, it may be called to the reader’s attention that 
the attention values for the advertisements studied in this sur- 
vey have been for the most part higher when based on the 
responses of the women. This would indicate, of course, that 
women read the advertisements more carefully than the men. 
This is to the advantage of the advertiser, for it is well known 
that women make up the bulk of the buying market. It is also 
well for the advertiser to keep in mind this fact so that he will 
make an attempt to direct his advertising toward women who 
are his ultimate purchasers. 

In summary it will be recalled that the conclusions set forth 
in this paper are as follows: (1) There is no relationship 
between the size of an advertisement and its attention value; 
(2) there are no preferred positions; (3) neither the right nor 
the left hand page possesses any advantage over the other; 
(4) the position of an advertisement on the page has no effect 
on its attention value; (5) advertisements which appear on 
Thursday and Friday have higher attention values than those 
advertisements which appear on Monday and Tuesday; and 
(6) women tend to read advertisements more thoroughly than 
men. 
PATERNAL OCCUPATIONAL INTELLIGENCE 
AND MENTAL DEFICIENCY 


KATHERINE PRESTON BRADWAY¹ 
The Training School at Vineland, New Jersey 


INTRODUCTION 


THE intellectual hierarchy of occupations, first suggested 
by Taussig in 1912 and later validated by the use of 
mental tests during the World War, has resulted in an 
easily applied method for estimating approximate intelligence 
levels of adults. This in turn has facilitated wide-range study 
of the relation between paternal intelligence and intelligence 
of progeny, using various occupational scales in arriving at 
estimates of the former, and administration of group tests in 
the school or some other objective testing procedure for deter- 
mination of the latter. The most widely used occupational 
scales are the Taussig scale (19), and Barr’s revision of Taus- 
sig’s scale (20, pp. 66-72). Although these scales have been 
used occasionally as indicators of socio-economic status, it 
must be kept in mind that their basis is the intelligence 
required for certain occupations rather than skill required, 
preparatory education, cultural accompaniments, financial 
returns, or any other factors which might be employed. It 
must further be recognized that in studies using the scales, 
the intelligence of the mother has been ignored except insofar 
as there is a relation between intelligence of wife and husband 
due to selection. 

Studies of parent-child relationships in regard to intelli- 
gence have shown that the occupations of fathers of children 
at the two extremes of the intelligence scale tend to cluster at 
the two limits of the occupational scale, and furthermore that 
in the middle range of the intelligence scale there is a positive 

1 The author wishes to acknowledge the assistance of Dr. Edgar A. Doll, 
Director of Research, and Mr. J. Thomas McIntire, Chief Clinician, at 
The Training School. 
relation between intelligence of child and occupational level 
of father. A brief summary of a few of these studies may 
serve to indicate the extent of these relations and the nature 
of the material. 

A representative study of the middle range or ‘‘average’’ 
child is found in the work of Haggerty and Nash (13) with 
6,688 pupils of New York rural grade schools. For six pa- 
ternal occupational groups classified according to a modified 
form of Taussig’s scale they report the median IQ’s of the 
children as follows: professional, 116; semi-skilled, 75; farm, 
91; unskilled, 89. Their data for 1,433 high school pupils 
grouped in a similar way show the same trend. The IQ 
ranges for each occupational level overlap considerably, but 
the significance of this overlapping is not reported, nor is it 
determinable since the data are not presented in such form as 
to permit the calculation of coefficients of correlation or the 
use of other statistical devices. 

Numerous investigations have been made of the occupations 
of fathers of gifted individuals. A list of the studies would 
include those of Galton (10), Ellis (7), Cattell (2), and Cox 
(6). All are in general agreement with Terman’s (20) in- 
vestigation of gifted children in California, in which the defi- 
nite superiority of occupational levels of parents of superior 
children was demonstrated. He found that fathers of 30 per 
cent of the children were in professional occupations, 50 per 
cent in semi-professional occupations, 12 per cent in skilled 
occupations, and less than 2 per cent were common laborers. 
The significance of these percentages is brought out when we 
compare them with the following percentages of comparable 
occupational groups in the population of the United States 
(census 1920): professional, 5; business and clerical, 14; 
skilled and semi-skilled, 23; agricultural, 46; unskilled, 11. 

Investigations of the economic derivation of individuals at 
the other end of the intellectual scale, the feeble-minded, have 
been less numerous than at the upper end. Richardson (18) 
reports the occupations of parents of 100 defective children in 
special classes in New Jersey, but makes no quantitative analy- 
sis of the data. By applying the Minnesota Occupational In- 
telligence Scale (1) to the data, we obtain the following per- 
centages for each occupational class: 2 per cent for Class I 
(highest); 1 per cent for Class II; 7 per cent for Class III; 
38 per cent for Class IV; 11 per cent for Class V; and 40 per 
cent for Class VI. In other words, the fathers of over 50 per 
cent of these children are employed in semi-skilled and un- 
skilled occupations, while only 3 per cent are working in a 
professional or business capacity, as compared with 32 per 
cent and 11 per cent, respectively, in a random sample of 
males of a large midwestern city (17). 

Ordahl’s (16) data on the families of 50 children at The 
Training School at Vineland 15 years ago indicated that the 
social status (for which paternal occupation was used as a 
criterion) of about 72 per cent was inferior, 26 per cent aver- 
age, and 1 per cent superior. However, the small number of 
subjects and the use of subjective judgment rather than a de- 
fined scale render the conclusions from these results suggestive 
rather than final. 

Paterson and Rundquist (17) appear to have made the most 
thorough investigation in the field so far. They classified the 
fathers of 823 residents of the Minnesota School for the Feeble- 
Minded, at Faribault, according to the Minnesota Occupa- 
tional Scale (1). Percentages of each occupational category 
represented, together with the corresponding percentages for 
a random sample of males in Minneapolis, Minnesota, are 
given in the last two columns of Table I. From this we see a 
definite trend for the occupations of parents of institutional- 
ized mental defectives to be at a lower level than occupations 
of a random sample of males in the general population. It 
may be necessary, however, to warn against an uncritical ac- 
ceptance of these percentages as representative in degree of 
the total feeble-minded population. It will be remembered 
that institutional populations represent only a small propor- 
tion of the estimated number of all the feeble-minded in the 
general population. Probably feeble-minded children of the 
lower social classes are more often institutionalized, propor- 
TABLE I 

Occupational Categories of Fathers of Feeble-Minded Subjects at Vine- 
land and at Faribault, and of a Random Sample of Male 
Adults in Minneapolis, Minnesota 

                        VINELAND SUBJECTS               FARIBAULT     RANDOM SAMPLE 
OCCUPATIONAL    Total       Primary     Secondary       SUBJECTS      MINNEAPOLIS POP. 
CATEGORY        (N = 439)   (N = 116)   (N = 123)       (N = 823) 
                   %           %           %               %              % 
I                  5           0           8               0              3 
II                16           2          28               4              8 
III               20           4          30               8             30 
IV                20          16          20              32             27 
V                 15          22          10              17             27 
VI                25          57           4              39              5 




















tionately, than are those of the upper classes. Furthermore, 
the Faribault institution, being a state institution, is composed 
entirely of state wards and contains no patients supported at 
family expense. Consequently, a survey might be expected to 
show a low occupational level for parents of these patients. 
Nevertheless, we may assume that the trend of percentages for 
this institution is fairly representative of the institutionalized 
feeble-minded in general. 

In apparent contradiction to the above findings of a positive 
relation between intelligence of child and paternal occupa- 
tional intelligence is the finding of an inverse relation between 
degree of intelligence of a feeble-minded individual and occu- 
pational status of father, i.e., more of the fathers of idiots are 
in professions than are those of morons, and more of the 
fathers of morons are laborers than are those of idiots. The 
results of Paterson and Rundquist’s study showing this rela- 
tionship are given in the last four columns of Table IV and 
will be discussed later. Ordahl’s data show a similar trend: 
only 1 per cent of idiots come from homes of inferior social 
status, as compared with 68 per cent of imbeciles and 86 per 
cent of morons. 

In the studies of both Ordahl, and Paterson and Rundquist, 
the fact that idiocy is apparently less often the result of 
heredity than is moronity is pointed out, and this tendency is 
offered as an explanation for an inverse relation. However, 
these authors were not in a position to consider the actual in- 
fluence of primary and secondary etiology. Since an adequate 
understanding of this relationship seemed to require such a 
procedure, we proposed in the present study to repeat the in- 
vestigation of Paterson and Rundquist, using subjects who are 
fairly well differentiated according to primary and secondary 
etiology. Furthermore, we deemed it desirable to check the 
results of Paterson and Rundquist with a study of subjects 
at a private institution, whose patients presumably are drawn 
from higher social levels than are those of the Faribault insti- 
tution. It is impossible to determine, in view of the limited 
data, whether this population of a private institution is a 
representative sample of the total feeble-minded or not. How- 
ever, comparison of the results with those of a state institution 
may clarify some of the tendencies concerning the occupa- 
tional background of the feeble-minded in general. 

Several scales have been devised for grading occupational 
status. In this study the Minnesota Occupational Intelligence 
Scale (1) was employed, both because of its apparent superi- 
ority to other such scales, and to permit direct comparison 
with the Paterson-Rundquist data. This scale is a revision 
of the Barr-Taussig Scale (20, pp. 66-72). It was constructed 
on the basis of the classification of 243 occupations as to their 
respective demands on intelligence, by 20 industrial psycholo- 
gists. They were guided by a standard classification of occu- 
pations into six major categories, derived from a careful study 
not only of Taussig’s classification and Barr’s extension of the 
Taussig scale, but also of objective mental test results as deter- 
mined by the army psychologists and as subsequently revised 
by Fryer (8). The six major categories comprise: (1) high 
professional and major executive positions; (2) lower profes- 
sional and business occupations; (3) technical, clerical, super- 
visory occupations; (4) skilled trades and lower grade clerical 
work; (5) semi-skilled occupations; (6) unskilled occupations. 
In 88 per cent of the occupations there was 50 per cent or 
more agreement among the judges as to classification. Brussel 
(1) found a reliability coefficient of .98 between two indepen- 
dent ratings of a list of occupations. 


SUBJECTS 


The Training School at Vineland is a private institution for 
the feeble-minded, but receives both state and private wards. 
State wards comprise about two-thirds of the population. 
However, there is a tendency for the social level of the state 
wards to be superior to those of similar patients committed to 
the several state institutions in New Jersey, since the classifi- 
cation policy tends to favor children of the better homes for 
assignments of state pupils to The Training School. The state 
wards are for the most part limited to feeble-minded children 
of chronological ages between 5.0 and 16.0, having mental ages 
of not less than 4 years or IQ’s not less than 35. Within these 
limits some selection is practiced, however, as trainable pa- 
tients are accepted in preference to non-trainable ones. More 
elasticity is permitted in regard to private cases, the IQ limit 
extending as low as 10 to 15 in some instances, and the chrono- 
logical ages being unrestricted at the upper end. 

The paternal occupations were obtained from the case history 
records on file. Children of uncertain parentage were omitted. 
Out of 525 children,² the parents of 439 were found to be 
classifiable as regards paternal occupation. Both entrance 
IQ’s³ and present IQ’s were recorded. However, present Stan- 
ford Binet IQ’s were used throughout the investigation in 
order to avoid the comparison of scores based on different 
revisions of the Binet, and to obtain relatively final grading 
of the subjects in IQ terms. 

The subjects were further differentiated as to primary or 
secondary etiology. The diagnosis of etiology of feeble- 


2 The term ‘‘children’’ herein refers to the mentally deficient subjects, 
regardless of life age. 

3 In case of adults, 14 years was used as the upper limiting life age in 
computing IQ’s. The limiting life age for IQ’s used in Paterson and 
Rundquist’s study is not reported. 







mindedness for each child is routinely determined in this in- 
stitution after a thorough consideration of history data and 
clinical observations. An individual whose etiology is classi- 
fied as primary is one for whom there is no evidence of post- 
natal causation and whose family background plausibly indi- 
cates hereditary transmission of deficiency. An individual 
whose etiology is classified as secondary is one for whom there 
is a definite and plausible history of post-natal causation. Of 
the 439 subjects in this study, the etiology of 116 had been 
diagnosed as primary, and of 123 as secondary. In 200 cases 
the etiology was unknown or mixed. 


RESULTS 


The total distribution of the Vineland children according 
to occupational status of fathers is given in the second column 
of Table I. Similar distributions for the Faribault children 
and of a random sample of male adults in Minneapolis are 
found in the fifth and sixth columns, respectively, of the same 
table. 


As was anticipated, the higher occupational classes provide 
a much larger proportion of the Vineland distribution than 
they do of the Faribault distribution, 41 per cent and 12 per 
cent, respectively, for the three highest classes combined. The 
Vineland population is, in fact, more comparable with the 
distribution for the random sample of adults, 41 per cent of 
the latter also being in the three highest occupational classes. 
However, there are five times as many Vineland parents classi- 
fied in the lowest occupational group as there are adults of the 
random sample. The latter result may in part be due to the 
presence of state wards of the less favored social groups. 

This bimodality of the distribution, apart from possible 
selective influences, suggests that more than one factor may be 
operative. A consideration of the children separated accord- 
ing to primary and secondary etiology substantiates this sug- 
gestion. These data are found in the third and fourth columns 
of Table I. The fathers of 57 per cent of the children of 
primary etiology belong to the unskilled labor class (Group
VI), while the fathers of 4 per cent of those of secondary etiology
are in this occupational group. In contrast with this,
2 per cent of the primary subjects are in the professional or
semi-professional categories (Groups I and II combined)
while 36 per cent of the secondary subjects are in these same 
two groups. In brief, compared to the random sample, the 
secondary group is superior in occupational classification, the 
primary group is markedly inferior, while the total group is 
similar. 

From these results we observe two influences at work, one 
based on sampling and the other on classification. The Fari- 
bault population probably represents subjects drawn predomi- 
nantly from relatively low social strata, while the Vineland 
population represents relatively higher strata. The distribu- 
tion of percentages of occupational classes for the economi- 
cally unselected feeble-minded would probably fall somewhere 
between these two extremes. However, since feeble-minded
subjects of primary etiology show such an unmistakable trend
toward the lower end of the scale, despite the probability that
the Vineland population is drawn from at least
the average primary feeble-minded of New Jersey, we may
venture the assumption that the incidence of hereditary feeble-mindedness
is limited predominantly to the lowest three occupational
classes.

Attention may now be transferred from relative tendencies
to absolute tendencies within a feeble-minded group, i.e., to
the relation between intelligence of child and intelligence of
parent. Three ways of showing the inverse relation between
IQ of child and paternal intelligence (as indicated by occupational
status) were recognized, namely, (1) correlation technique,
(2) calculation of median IQ’s, and (3) tables of percentages.
The Pearson coefficient of mean square contingency
for IQ level of child and occupational class of parent (for
Vineland data) was C = -.40. The similar coefficient computed
from the Paterson and Rundquist data revealed a comparable
degree of negative relation, namely, C = -.37. Both
coefficients indicate that there is a tendency for the fathers of
the children with the least intelligence to be occupied in positions
requiring the most intelligence, and vice versa.
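
The coefficient of mean square contingency is conventionally obtained
from the chi-square of the contingency table as C = sqrt(chi-square /
(N + chi-square)). The following is a minimal sketch of that computation;
the function name and the table of counts are hypothetical and are not the
Vineland data. A coefficient computed in this way cannot be negative, so a
sign such as that reported above has to be attached from inspection of the
direction of the trend in the table.

```python
import math

def contingency_coefficient(table):
    """Pearson's C = sqrt(chi2 / (N + chi2)) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n + chi2))

# Hypothetical counts: rows = IQ level, columns = occupational class.
example = [[30, 20, 10],
           [10, 20, 30]]
print(round(contingency_coefficient(example), 3))
```
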

A further analysis was thought advisable to determine 
whether the trend was consistent from class to class. The 
median IQ’s for each occupational class are given in Table II. 


TABLE II
Median IQ’s of Feeble-Minded Subjects at Vineland for Successive
Occupational Categories of Fathers

FATHER’S               TOTAL                SUBJECTS OF               SUBJECTS OF
OCCUPATIONAL          SUBJECTS              PRIM. ETIOL.             SECOND. ETIOL.
CATEGORY            N     Med. IQ           N         Med. IQ          N         Med. IQ

I                  20      45.0        [illegible]  [illegible]       10          45.0
II                 69      46.3        [illegible]     47.5           35          43.8
III                88      47.1        [illegible]     47.5            7          45.4
IV                 88      50.8            18          55.0       [illegible]     46.0
V                  66      58.1            25          61.8       [illegible]     56.3
VI                108      59.6            66          63.6       [illegible]     55.0

It will be noted that as the occupational class becomes lower 
(Group I being high) the median IQ for that class rises. The 
median IQ’s for the primary cases are similar in trend to those 
for the total group. The secondary cases have similar IQ’s 
in the four highest classes and show a rise of 10 points in IQ 
in the two lowest classes. 

Because of the possibility of unreliability due to the small
number of cases in each class, medians for combinations of
classes were found and are recorded in Table III. The same
trend is even more definitely apparent in the primary cases
and is brought to light also in the secondary cases. The Faribault
data are not reported in sufficient detail to permit the
calculation of accurate medians.


TABLE III
Median IQ’s of Feeble-Minded Subjects at Vineland for Combined
Occupational Categories

FATHER’S                 SUBJECTS OF               SUBJECTS OF
OCCUPATIONAL             PRIM. ETIOL.             SECOND. ETIOL.
CATEGORY                 N         Med. IQ          N      Med. IQ

I and II            [illegible]     47.5           45       43.8
III and IV          [illegible]     54.2           61       45.6
V and VI            [illegible]     62.9           17       56
I, II and III       [illegible]     48.8           82       45
IV, V and VI        [illegible]     62.6           41       51

In order to further compare our results with the Faribault 
study, percentages were computed from the Vineland data for 
each occupational class at each of four IQ levels. These re- 
sults are included with the Faribault results in Table IV. 

In the Vineland primary group 61.5 per cent of the 40-59 
IQ’s and the 60 + IQ’s combined fall in the sixth occupational 
class. Only 5.5 per cent of these two groups combined fall in
the first, second and third occupational classes combined. 
These results are comparable with the Faribault results, in 
which 55 per cent of the 41-60 IQ’s and the 61+ IQ’s com- 
bined fall in the sixth occupational class, and only 8.5 per cent 
fall in the first, second and third classes combined. 

On the other hand, in the Vineland secondary group the 
occupational distributions for the 40-59 IQ’s and the 60+ 
IQ’s assume more symmetrical curves, being slightly skewed 
toward the higher occupational classes, the central tendency 
falling at the third class rather than between the fifth and 
sixth classes. 

The occupational distribution for the 20-39 IQ’s in the 
Vineland primary group is also skewed toward the sixth occupational
class, but is more evenly distributed among the third,
fourth, fifth and sixth classes, similar to the Faribault data. 
The distribution of the 20-39 IQ’s in the Vineland secondary 
group, however, is skewed even more toward the first occupa- 
tional class than is the distribution of the higher IQ’s. 

In brief: in primary cases the tendency of the distributional 
array of IQ’s is a skew toward the sixth (lowest) occupational 
class, whereas in secondary cases the tendency of the distri- 
butional array of IQ’s is a skew toward the first (highest) 
occupational class, although less pronounced than the opposite 
tendency of the primary cases. In other words, the higher 
the IQ the more marked is the skew for primary cases, and 
conversely the lower the IQ the more marked is the skew for 
secondary cases. The occupational distribution for primary
cases according to IQ’s is similar to the Faribault study.

Distribution percentages of both Vineland and Faribault⁴
subjects for each occupational class according to per cent of
IQ ranges are given in Table V.


TABLE V
Relation between IQ’s of Feeble-Minded Subjects and Occupational
Categories

[Body of table only partly legible in the source. Column headings: IQ RANGE;
OCCUPATIONAL CATEGORIES I, II, III, IV, V, VI. Surviving entries are given
in their original row order, with gaps marked ‘‘...’’.]

Total Vineland Subjects
N = ..., 69, 88, 88, 66, 108

    7   ...    0    1
   33   26    18    7
  ...   43    43
   16   ...   49

Faribault Subjects
N = 67, ..., 259

   32   37   11
   32   20    9
   24   29   39
   12   14   40

As shown in Table V, 49 per cent of the total Vineland subjects
in the sixth occupational class have IQ’s of 60 or more,
while only 8 per cent have IQ’s below 40. The same tendency
is noted, but to a lesser extent, for occupational classes IV and
V, while as many as 40 to 50 per cent of Classes I, II, and III
have IQ’s below 40. That is, the higher the occupational class
the more evenly are the cases distributed in the IQ groups,
and the lower the occupation the more tendency there is for
the cases to approach 60+ IQ’s. This tendency, although suggested
by per cent distributions of first and sixth occupational
classes of the Faribault study, is not borne out by the other
classes, the second, third, fourth, and fifth classes all skewing
toward 1-20 IQ’s.


4 These percentages are not reported in the study but were calculated
from the data.

Paterson and Rundquist explained the inverse relationship
between IQ and occupational class by the fact that ‘‘low-grade
feeble-mindedness is caused, for the most part, by accidental, 
pathological, non-hereditary factors which would be distrib- 
uted more or less at random among the various classes of 
society, whereas simple feeble-mindedness is transmitted by 
biological heredity’’ and would therefore be found most often 
at the lower occupational levels. According to this explana- 
tion, one would expect to find IQ’s for the primary cases all 
fairly high and those for secondary cases more uniformly low, 
with no trend either up or down in variation with occupational 
class for either group considered separately. It is found, how- 
ever, that the inverse trend for primary cases taken alone is 
more marked than that for the total subjects (see Tables II and 
III), and that the two lower occupational classes for the sec- 
ondary cases result in higher IQ’s than do the four highest 
classes. Therefore, it would seem that the negative relation 
holds true not only for a total group of feeble-minded but also 
for primary and secondary cases considered separately. One 
reason for this in the Vineland group is offered. The families 
of higher occupational levels may succeed in getting children 
with IQ’s less than 35 (the lower ‘‘acceptable’’ limit) into the 
institution more easily than is found possible in the case of 
families of lower occupational levels, either by sending the 
child as a private case or through the usual channels of eco- 
nomic influence. Furthermore, families of upper occupational 
strata are more likely to keep the higher type of feeble-minded 
child in the home and, because of home supervision, state insti- 
tutional care is unnecessary, while the higher type of feeble- 
minded child of low socio-economic status is more frequently 
institutionalized by the state. These considerations were not 
taken into account in the Paterson and Rundquist study. 
Whether the separation of their cases into primary and sec- 
ondary groups would show similar trends is not known. How- 
ever, it may be that the above explanation does not entirely 
cover the facts and that there is a real inverse relation between 
occupational class and IQ when primary or secondary cases 
are considered separately. Another study in which the above-mentioned
selective factors are controlled will be necessary in
order to solve this problem. If it is found that a negative or 
zero relation exists between intelligence of feeble-minded 
parents and their feeble-minded children, we may have an im- 
portant clue to the inheritance of feeble-mindedness. 


SUMMARY 


1. IQ, etiology and paternal occupation were obtained for
439 feeble-minded children at The Training School at Vine- 
land. The data were then classified according to paternal 
occupation in one of six categories by means of the Minnesota 
Occupational Scale. 

2. The distribution of paternal occupations for 116 feeble- 
minded subjects of primary etiology at Vineland was skewed 
toward the lowest occupational class, thus resembling a similar 
distribution for subjects undifferentiated as to etiology at 
Faribault. 

3. The distribution of paternal occupations for 123 feeble- 
minded subjects of secondary etiology at Vineland approxi- 
mated a symmetrical curve and thus resembled an occupational 
distribution for a random sample of adult males in Minne- 
apolis, Minnesota. 

4. The distribution of paternal occupations for the total 439 
feeble-minded subjects at Vineland was found to be relatively 
undifferentiated and irregular in form, and resembled neither 
the occupational distribution for the Faribault subjects nor 
that for the random sample of adult males. 

5. A negative relationship was found between IQ and pater- 
nal occupational status for all three groups: total, primary 
etiology, and secondary etiology. This was most marked in 
the primary etiology group and least marked in the secondary 
etiology group. The negative relationship may be partly 
explained by chance selective factors which were uncontrolled 
but suggests the possibility that among feeble-minded subjects 
there is not the same positive relationship between intelligence 
of parent and that of child as is found among normal subjects. 


PREREQUISITES FOR A CLAIRVOYANCE
HYPOTHESIS 


RAYMOND ROYCE WILLOUGHBY 
Clark University 


In a current critique (2) of Dr. J. B. Rhine’s now well-known
monograph on extra-sensory perception (1) the
suggestion was made that no hypothesis of clairvoyance as
a factor in alleged extra-chance correspondences in card guessing
was in order until more rigorous experiments, of a type
there described, were carried out to determine whether the correspondences
found were really extra-chance, with positive
results. The present communication reports an experiment
of this type; since there is no presumption of clairvoyant
powers in either of the two subjects concerned, the results
have slight bearing upon the clairvoyance hypothesis itself,
but any methodological interest they may have is unimpaired
by this fact.

The fundamental plan of the experiment is that of Rhine’s
‘‘DT’’ work: a pack of Zener cards (25 cards, of 5 suits of 5
cards each, with the following devices: circle, lines, plus, rectangle,
star) is shuffled three times and cut once, and laid upon
the table face down in view of the subject; the latter, who
knows in advance the constitution of the pack, makes 25 successive
guesses, which are recorded by the experimenter, who
stops him at the required number. There are no special instructions
except that it is suggested that the subject avoid all
forms of logical control and guess simply whichever of the
five forms comes into his mind. No one knows either the real
order of the cards or the subject’s degree of success until all
of the 25 guesses have been recorded; in most cases the subject
does not know these items at all, but no definite control of this
point was introduced.

Two variations were added to the experiment as thus outlined:
(1) the experimenter also served as subject by recording
his own series of 25 guesses before calling for the second
subject’s series;* (2) an impersonal or empirical chance series
of ‘‘guesses’’ was introduced by shuffling a second Zener pack
in the same manner as the first and allowing it also to lie face
down on the table until the guesses of both subjects had been
recorded, when it was turned up and recorded as if it were a
guessed series by a third subject. The entire schedule for a
‘‘set’’ of series was therefore: (1) shuffle ‘‘target’’ pack three
times and cut; (2) shuffle ‘‘chance’’ pack three times and cut;
(3) record experimenter’s guesses; (4) record subject’s
guesses; (5) record ‘‘guesses’’ of chance pack; (6) record
target pack. The total data for the experiment consisted of
200 of such sets of four series each. A sample set, No. 179,
is reproduced herewith; T refers to the target series (appears
first, but recorded last), R to the first subject’s guessed series,
A to the second subject’s guessed series, C to the empirical
chance series; c, l, p, r, s are respectively circle, lines, plus,
rectangle, star:








T   [.....]  srpcl  lrlrp  cpslc  pcplc
R   [.....]  slpcc  rlpcp  srllc  sscpl
A   [.....]  lrlcs  ppslr  rpccp  [.]rlcs
C   [.....]  rsllp  rpclc  psrcc  cslls

[Bracketed dots mark symbols not legible in the source.]





* Indebtedness is gratefully acknowledged to Mrs. A. B. Hunter, who
kindly served as second subject.

The variables with which we shall be most concerned are the
numbers of correspondences or ‘‘hits’’ made upon the target
series by each real subject and by the chance series; thus by
subject R hits are made upon the first s, the first r, the fourth
s, the first p, the first c, the second p, the fourth l, and the third
c of the target series, a total of 8 hits; in similar fashion subject
A makes 3 hits, and ‘‘subject’’ C 2. We thus have three
empirical hit distributions which can be compared; to these
we shall add what may be called the theoretical or ideal chance
distribution, computed from the expansion of the point binomial
$(p+q)^n$, the general term of which, ${}_nC_r\,p^r q^{n-r}$, gives
the proportion of an infinite number of series of n trials each
which will yield r successes, when p is the abstract probability
of success in a given trial and q the abstract probability of
failure; n is here 25, p 1/5, and q 4/5. For present purposes the
numbers of hits in this distribution have been rounded to the
nearest unit in a distribution of total frequency 200, to make
them directly comparable with the other distributions. These
four hit distributions, with graphs (Fig. 1), follow (I being
the ideal distribution just described):

[Table of the four hit distributions (R, A, C, I) only partly legible in the
source; surviving entries, in their original row order:]

11 37 42 28 27 22 13
17 28 39 37 26 13
19 43 10 14
14 13

Fig. 1. Frequency distributions of hit scores for real subjects (R, A),
empirical (C) and theoretical (I) chance; 200 trials.
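
The scoring of hits and the ideal chance distribution described above can
be sketched as follows. This is an illustration only: the function names
and the five-card fragment are hypothetical (the fragment is not taken from
set No. 179), and the ideal frequencies are simply the binomial terms for
n = 25 and p = 1/5 scaled to 200 series.

```python
from math import comb

def count_hits(target, guesses):
    """Number of positions at which a guess matches the target card."""
    return sum(t == g for t, g in zip(target, guesses))

def ideal_frequencies(n=25, p=0.2, series=200):
    """Expected number of series (out of `series`) yielding r hits, r = 0..n."""
    q = 1.0 - p
    return [series * comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

# Hypothetical five-card fragment: positions 1, 3 and 4 agree.
print(count_hits("srpcl", "slpcc"))

# Ideal frequencies rounded to the nearest unit, for hit scores 0 to 12.
freqs = ideal_frequencies()
for r in range(13):
    print(r, round(freqs[r]))
```

Rounding each term separately means the rounded frequencies need not sum
to exactly 200.
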

The degree of approximation of the total (or mean) hits
represented by these distributions to the theoretical mean
value np, so much stressed by Rhine, will be of some interest;
np is here 1/5 × 200 × 25 = 1000, and the total hit frequencies for
each series are

[Table of total hit frequencies not legible in the source.]

* Discrepancy due to rounding of figures.

which may be converted into mean hits per series of 25 if
desired by dividing by 200. Following Rhine’s method the
deviations of these values from the theoretical np figure of
1000 are -34, +47, and -54 for R, A, and C respectively, and
the corresponding ‘‘X’’ values (deviation/p.e.) are 1.8, 2.5,
and 2.7; these are based upon the formula p.e. = .6745 $\sqrt{npq}$;
i.e., the p.e. obtained (19.0) is a function of the standard deviation
of an hypothetical infinite distribution of 5000-trial hit
scores, to which our data offer nothing comparable. The pitfalls
involved in (1) treating the data as if they comprised one
continuous series of 5000 guesses instead of 200 series each of
limited distribution, and (2) calculating probabilities from
the X values on the assumption that the relevant distributions
are normal, have been discussed in the article mentioned; if
these hazards are disregarded, there appear to be slight indications
that at least two of our total hit scores are different
from chance, although the one most nearly significantly so is
chance by definition.
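
As a sketch of the arithmetic behind the ‘‘X’’ values, the probable error
and the ratio deviation/p.e. may be computed as below. The function names
are hypothetical; the only inputs taken from the text are n = 5000 guesses,
p = 1/5, and the deviations quoted for subjects R and A.

```python
import math

def probable_error(n_trials, p):
    """p.e. = 0.6745 * sqrt(n p q) for a binomial count of hits."""
    q = 1.0 - p
    return 0.6745 * math.sqrt(n_trials * p * q)

def x_value(deviation, n_trials, p):
    """Absolute deviation from np expressed in probable errors."""
    return abs(deviation) / probable_error(n_trials, p)

pe = probable_error(5000, 0.2)
print(pe)                                  # roughly 19
print(round(x_value(-34, 5000, 0.2), 1))   # deviation quoted for subject R
print(round(x_value(47, 5000, 0.2), 1))    # deviation quoted for subject A
```
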

A less inadequate method of evaluation will be to study the 
distribution of deviations from chance for each subject; since 
using for this purpose any fixed value, such as 5, would result 
in a mere duplicate of the corresponding hit distribution with 
changed notation, deviations from the empirical chance value 
obtained in the given trial will be employed instead. The 
distributions are 

[Table of the R-C and A-C deviation distributions, each of total
frequency 200, only partly legible in the source; surviving entries:
33 24 26 12 (first row) and 20 33 17 (second row).]




The mean of the R-C distribution is .13 and its standard deviation
2.90; the corresponding figures for the A-C distribution
are .53 and 2.88. The standard error of the difference between
means is .29, and the difference of .40 is accordingly
insignificant, as are the respective differences from 0.
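
The standard error quoted above can be checked with a short sketch. It
assumes the ordinary formula for the difference between two uncorrelated
means, which reproduces the reported figure of .29; whether the author
used exactly this formula is not stated, and the function name is
hypothetical.

```python
import math

def se_of_difference(sd1, n1, sd2, n2):
    """Standard error of the difference between two uncorrelated means."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Standard deviations and numbers of series as given in the text.
se = se_of_difference(2.90, 200, 2.88, 200)
diff = 0.53 - 0.13
print(round(se, 2))          # standard error of the difference
print(round(diff / se, 2))   # difference expressed in standard errors
```
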

Perhaps the best method, however, is the chi-square test for
goodness of fit, which seems admirably adapted to this situation.
If we use the I distribution as theoretical and the R, A,
and C distributions in turn as observed distributions to be
fitted by it, we obtain P’s of about .21, .32, and .29 respectively;
that is, in a large series of random fits of this sort we
should expect fits as bad as or worse than the one under discussion
in about one-fifth, one-third, and three-tenths of the
cases respectively. In other words, the observed distributions
are not significantly different from chance ones as measured
by the ideal curve. But if we use the empirical series as theoretical
and fit it to the R and A distributions, we find P’s of
approximately .00001; that is, only about once in 100,000
trials could we expect a fit as bad as that observed or worse.
Thus although the empirical and ideal chance series are not
significantly different, uncompensating chance irregularities
in the former and in a subject’s series may give the illusion of
large extra-chance performance by the subject. It appears to
follow that our former position should be modified to the
extent of accepting the ideal series as the best criterion, unless
it can be shown that a given shuffler’s empirical chance series
form a systematically distorted hit-score distribution.
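
A minimal sketch of the chi-square statistic used in such a test of fit is
given below. All frequencies are hypothetical (they are not the R, A, C or
I distributions of this experiment) and are chosen only to show how the
same observed series can fit a smooth ideal series tolerably and an
irregular empirical series much worse. The conversion of the statistic to
a P value, from tables or a statistical library, is omitted.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic, sum((O - E)^2 / E) over paired classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical frequencies for five hit-score classes, each totalling 200.
ideal     = [47, 37, 39, 33, 44]
empirical = [35, 50, 30, 45, 40]
observed  = [52, 30, 45, 28, 45]

# Fit of the observed series to the smooth ideal series.
print(round(chi_square(observed, ideal), 2))
# Substituting an irregular empirical series for the ideal one changes the
# expected values, so the same observed series can fit it far worse.
print(round(chi_square(observed, empirical), 2))
```
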

We are still confronted by the fact of inter-individual difference;
while it is customary to ascribe this, within limits, to
‘‘chance,’’ it would be desirable to reduce this agnostic area
by isolating within it any factor that may be identifiable. In
the present case we may test one possible factor in this area
which received admittedly inadequate attention in the preceding
paper, namely that of partiality on the part of the subject,
or tendency to guess certain suits at the expense of others.
A suitable measure for this tendency is the total deviation, or
sum of the absolute deviations from 5 of the numbers of each
suit called; thus if in a given series 5 circles are guessed, but
7 lines, 6 plus, and only 3 rectangles and 4 stars, we may
express the tendency to partiality by a total deviation of
0+2+1+2+1=6. The distributions of such total deviations

[Table of the distributions of total deviations for subjects R and A, each
of total frequency 200, not legible in the source; the surviving column
headings are 4, 6, 8, 10, 12* and Total.]

* No odd-numbered values are possible, since each positive deviation
from 5 in one suit implies a corresponding negative deviation in another.


These values, however, correlate negligibly with the hit scores
(-.04 and .03 for subjects R and A respectively) and must be
regarded as without influence upon them. Thus we are reduced
to the conclusion that at the present time no cause can
be isolated as effective in varying the hit scores beyond the
complex and unanalyzed congeries of slight causes called
chance.
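
The total deviation measure of partiality defined above can be sketched as
follows; the function name and the single-letter coding of the suits are
illustrative assumptions, and the series used is the worked example given
in the text.

```python
from collections import Counter

SUITS = "clprs"  # circle, lines, plus, rectangle, star

def total_deviation(guesses, expected_per_suit=5):
    """Sum of absolute deviations from 5 of the number of calls of each suit."""
    counts = Counter(guesses)
    return sum(abs(counts[s] - expected_per_suit) for s in SUITS)

# The worked example from the text: 5 circles, 7 lines, 6 plus,
# 3 rectangles and 4 stars (the order of the calls is irrelevant here).
series = "c" * 5 + "l" * 7 + "p" * 6 + "r" * 3 + "s" * 4
print(total_deviation(series))  # 0 + 2 + 1 + 2 + 1 = 6
```
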


Finally, a large number of interesting introspective data 
having some bearing upon the clairvoyance problem appeared 
in the course of the experiment; since no mention was made 
of any parallel phenomena in the original report, it may be in 
order to indicate here their general nature. In the first place, 
it was always necessary for the experimenter to stop the sub- 
ject when 25 guesses had been made; clairvoyant subjects 
would presumably stop spontaneously at the correct number, 
and since Dr. Rhine’s monograph makes no mention of this
point, we may suppose that this was the case with his subjects. 
Second, it was virtually impossible to avoid control from 
memory of the part of the series already guessed; here again 
our results differ from Dr. Rhine’s, since he is repeatedly 
explicit that no such control was involved in his series. Our 
generally low values for the total deviation scores are rather 
largely due, we judge, to this control, the subjects being con- 
stantly conscious (in spite of efforts to avoid it) of about how 
many of each suit had been guessed. Third, the form of report
(‘‘circle, lines, plus, rectangle, star’’), which was chosen
so as to minimize confusion in reporting and recording, proved 
unexpectedly to produce continual confusion and blocking as 
between different sense modalities and different ideas and asso- 
ciations. One subject tended constantly to report ‘‘square’’ 
for the rectangle and ‘‘waves’’ for the lines; the other on 
different occasions had to fight off the tendency to report ‘‘triangle’’
and ‘‘minus’’ as associations to ‘‘rectangle’’ and
‘‘plus.’’ An endeavor was made to experience the association
visually, to ‘‘see’’ the forms in succession; notwithstanding
this, many associations came in auditory form and a few
(to the experimenter as subject) kinesthetically; in some cases
there was definite conflict, as when, after failure to ‘‘see’’ any
form, the word ‘‘circle’’ was ‘‘heard,’’ and the pencil was
observed (with some surprise) to record the first sound of that
word, which however was the symbol for ‘‘star.’’ In the
absence of information on the matter, we must assume that 
Dr. Rhine’s subjects experienced no such complexities of men- 
tal process; it may be possible, however, that some of them 
either suppressed or did not notice them, perhaps under the 
influence of the vague feeling for a balanced distribution of 
the five suits which we were unable to eliminate. 

We have failed to demonstrate any extra-chance correspon- 
dence between guessed order and actual order of cards in 
shuffled packs as a result of this brief experiment (as, indeed, 
we believed it likely that we should do, since there was no 
initial evidence for clairvoyance in either subject). We hope, 
however, to have indicated the nature of the distributions to
be expected when testing for ability of selected subjects to 
obtain such correspondences; and more importantly, we hope 
to have demonstrated a technique which may be regarded as 
prerequisite to the postulation of any extra-sensory hypothesis 
if violation of the principles of parsimony is to be avoided. 


THE WILLOUGHBY TEST OF CLAIRVOYANT
PERCEPTION 


CHARLES E. STUART 
Duke University 


[Willoughby] reports an experiment the conditions of which he holds to
be necessarily prerequisite to a consideration of clairvoyant
perception as an hypothesis in card-guessing tests. In a previously
written article (4) he criticized the experiments reported
by Dr. J. B. Rhine (3) as being inadequate to support
the conclusion that an extra-sensory mode of perception exists,
and proposed the conditions of his own experiment as being
necessary to overcome this inadequacy. Just what conditions
does he add to Rhine’s work?

(A) Willoughby requires that a subject make exactly 5,000 
guesses, 200 runs of 25 guesses each. Rhine set no limit one 
way or another upon the number of guesses recorded for a 
given subject. 

No reason is given by Willoughby why exactly 200 runs of 
25 guesses per subject should be necessary; and, of course, 
none exists. It could as well be 50 or 1,000. And however an 
exactly similar number of guesses per subject might facilitate 
recording and internal studies such as subject to subject corre- 
spondence, it has no bearing upon the question: Is there an 
extra-sensory mode of perception? This experimental condi- 
tion is wholly arbitrary and as such cannot be ‘‘necessary.’’ 

(B) Willoughby requires that the subject guess only on an 
unbroken pack without the cards being separated during the 
guessing. This was Rhine’s ‘‘DT’’ condition; but Rhine 
might use several methods at a given sitting, the results ob- 
tained in a series of experimental sessions under a given con- 
dition representing one distinct experiment. The conditions 
are the same except that Rhine’s subject might do a number 
of similar tasks between consecutive ‘‘DT’’ runs. 

This condition, however, is a matter of laboratory convenience,
and depends in large part upon whether the experimenter
wishes to stress physical or psychological conditions.
Certainly the only essentials are that the subject have no sen- 
sory contact with or inferential knowledge of the figure on 
the face of the card, and that if physical conditions are varied 
the experiments be grouped accordingly. The limitation to 
‘‘DT’’ guessing interposes a psychological rigidity that is 
certainly not ‘‘necessary.’’ 

(C) Willoughby requires a second pack other than the 
target pack to serve as a dummy, whose matched hits should 
be necessarily ‘‘chance.’’ Rhine has not done this, assuming 
that ‘‘the mathematical theory has been tested many times’’ 
(3a, p. 110); that is, that the theoretical ideal chance expec- 
tation is adequate for the purpose. 

It is difficult to see just what this variation adds to the ex- 
periment. It may serve as a chance check; that is, it may 
reassure the experimenter who might have doubts whether the 
distributions to be expected were those of a binomial expansion.
But such a check can be made independently without
being introduced into the experimental technique. It may 
serve as a shuffling check to show the possibility of extra- 
chance scores resulting from inadequate shuffling. But a much 
more relevant check would be to compare each target series 
with the one preceding. Or the chance series might serve as a 
‘‘control,’’ being defined as ‘‘chance’’ and the scores of the
other subjects considered as variates from it. This latter was
undoubtedly Willoughby’s original purpose in including the
empirical chance series. But while in the first method which
he uses to evaluate his results this empirical chance series is 
used as a standard from which to measure the variations of 
the subjects, in his second method, involving the same sort of 
measurement of the subject’s performance against a standard 
of chance expectation, the empirical chance series is discarded 
and discredited! 

It is important to note just why Willoughby has discarded 
this feature of the ‘‘prerequisite’’ upon which he insisted
earlier. 


‘‘. . . the observed distributions are not significantly different
from chance ones as measured by the ideal curve. But if
we use the empirical chance series as theoretical and fit it to
the R and A distributions, we find P’s of approximately
.00001. . . . Thus although the empirical and ideal chance
series are not significantly different, uncompensating chance
irregularities in the former and in a subject’s series may give
the illusion of large extra-chance performance by the subject.
It appears to follow that our former position should be modified
to the extent of accepting the ideal series as the best criterion.
. . .’’ (Italics mine.)


Bluntly stated the fact is that since nowhere else in the 
report is evidence offered for ‘‘uncompensating chance irregu- 
larities,’’ the empirical chance series is discarded because it 
gives significant deviations! A condition so entirely at the 
mercy of the results cannot be a ‘‘necessary’’ condition of any 
experiment. 

(D) Willoughby evaluates his results in three ways: by
Rhine’s method of finding a critical ratio D/P.E., which he
rejects; by comparing the human subjects, run by run, with
the chance subject; and by the chi-square test of goodness of
fit of the observations to the binomial expansion $N(.8 + .2)^{25}$.

These three methods, Rhine’s method, the ‘‘less inadequate 
method,’’ and the ‘‘ best method,’’ all indicate that subjects R, 
A, and C showed no significant deviation from chance expec- 
tation in calling the cards. None of the deviations give a crit- 
ical ratio of 4 P.E. which was required in the Duke experi- 
ments. And the chi-square test showed that the frequency of 
hits per 25 was distributed fairly near to what we might 
expect from an hypothesis of chance as the only causal factor. 
Nothing in the argument or the results shows one method to 
be more ‘‘adequate’’ or ‘‘better’’ than another. The critical 
ratio method used by Rhine has been a standard procedure in 
this kind of experiment for many years. It was that used by 
Coover (1) and by Estabrooks (2) in similar experiments, and 
has the endorsement of R. A. Fisher (3b, p. 41). The chi- 
square method of testing goodness of fit is a supplementary 
method that suggests interesting points about a given subject’s 
performance, but it is certainly not a ‘‘necessary’’ condition 
for the experiment. 
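The critical ratio discussed above may be illustrated by a short sketch. It assumes the usual definition of the probable error as 0.6745 standard deviations of the binomial count, with a chance probability of .2 per call; the function name and the figures of 230 hits in 1000 calls are hypothetical.

```python
from math import sqrt

PE_FACTOR = 0.6745  # probable error expressed in standard deviations (normal curve)

def critical_ratio(hits, trials, p_hit=0.2):
    """Return D / P.E., where D is the deviation of the observed hits
    from chance expectation and P.E. is the probable error of the count."""
    expected = trials * p_hit
    sigma = sqrt(trials * p_hit * (1.0 - p_hit))
    probable_error = PE_FACTOR * sigma
    return (hits - expected) / probable_error

# Hypothetical series: 230 hits in 1000 calls (chance expectation 200).
cr = critical_ratio(hits=230, trials=1000)
print(round(cr, 2), "meets the 4 P.E. criterion" if cr >= 4 else "falls short of 4 P.E.")
```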

The four major points in which the Willoughby experiments 
differ from Rhine’s work have been shown to be unessential 
matters of the experimenter’s preference, having no bearing 
upon the hypothesis tested by the experiments. It must be 
concluded that Dr. Willoughby’s report neither demonstrates 
the necessity of his own method nor indicates wherein the 
Rhine experiments were less adequate. 


DRUG ADDICTION IN ITS RELATION TO 
EXTROVERSION, AMBIVERSION, 
AND INTROVERSION 


RALPH R. BROWN 
United States Public Health Service 


IN a recent article by Dr. J. B. Miner entitled ‘‘The Psycho- 
Medical Correction of the Drug Habit,’’ published in the 
Journal of Abnormal and Social Psychology (7), the rela- 
tionship between drug addiction and introversion-extroversion 
is noted. The author advocated further investigation along 
this line, believing that isolation of temperamental types might 
prove of value in more accurately prognosticating the course 
of treatment for addiction. Although the writer is in thorough 
agreement with Dr. Miner’s statement as to the value of 
temperamental type differentiation, nevertheless, from the fol- 
lowing results to be presented, this writer is unable to agree 
with McDougall’s conclusions as to the close affinity between 
introversion and the alkaloid drugs. 

Because of the unsatisfactory nature of personality sched- 
ules, rating blanks and neurotic inventories, the writer has 
attempted to determine the proportion of introverts, ambiverts 
and extroverts by means of body type differentiation according 
to Kretschmer’s classification. The justification for such an 
approach lies in the close affinity of Kretschmer’s cyclo- 
thymic and schizothymic temperaments with the generally 
recognized aspects of extroversion and introversion. Although 
the writer does not believe that an absolute one to one correla- 
tion exists between the asthenic type of body build and schizo- 
thymic temperament, nevertheless, the results of Kretschmer’s 
work and other studies would seem to lead us to expect a rather 
large percentage of asthenics in any group of introverts (4). 
If McDougall’s metabolic theory is correct, therefore, one may 
reasonably expect to find a large percentage of the lateral, or 
microsplanchnic, or asthenic (whichever terminology one 
might prefer) type of body build among addicts. 

This study consists of one hundred sixty-two morphine ad- 
dicts admitted into Governmental Custody. Chinese, Mexi- 
cans, Mongolians and Negroes were not included in the study, 
but other than this, no selection of cases was attempted. Most 
of the subjects had been withdrawn from morphine while in 
jail awaiting transfer to the institution, although a few 
showed symptoms of recent withdrawal during the examina- 
tion. All of the measures were taken by the writer and stand- 
ard anthropometric technique was employed (8) (4). 

In Table I is presented the proportion of body types differ- 
entiated by Kretschmer’s subjective method. It will be noted 
from this table that the pyknic and athletic types definitely 
predominate, with a very small percentage of asthenics. Even 


TABLE I 
Percentage of Body Types According to Subjective Rating 
(After Kretschmer) 

BODY TYPE          PER CENT 
Pyknic                29 
Pyknoid               13 
Athletic              29 
Athletic-Asth.        10 
Asthenic               4 
Dysplastic            14 
Unclassified           6 

when the dysplastic and unclassified types are placed in with 
the asthenic group, the percentage is raised only to 19 per cent, 
which is still below the percentage of pyknic body types. 


TABLE II 
Percentage of Body Types According to Wertheimer Index 

BODY TYPE          PER CENT 
Pyknic                40 
Athletic              50 
Asthenic              10 

In addition to the Kretschmer subjective method, Wertheim- 
er’s Index (8) was calculated on each case. As may be noted 
in Table II, the objective method of body-type differentiation 
presents somewhat the same sort of picture. Here we have 
sixteen per cent of asthenics as compared with forty per cent 
of pyknics. If McDougall’s hypothesis were correct, one 
should expect just the reverse of these findings. 

The graph presented herewith shows even more clearly the 
tendency of this group toward the pyknic type of body build. 
The mean Wertheimer Index number is 255.3, which would 
place the average addict close to the borderline of the pyknic 
type. Individuals with such indices are generally listed as 
pyknoids. The tri-modal appearance of this graph would in 
all probability disappear with three or four hundred additional 
cases. It seems rather improbable, however, that any material 
change would occur in the general tendency shown on the 
graph. 

Turning to McDougall’s original experiments from which 
were drawn his conclusions as to the affinity between a tendency 
towards morphinism and the introvert type, we begin to see 
reasons why objective research would not fortify his theory. 
The general method of McDougall’s experiment was as follows 
(5): The subjects were seated at a twenty or thirty degree 
angle to the plane of rotation of a small windmill. Under these 
conditions the arms of the mill appeared to reverse their mo- 
tion at short intervals. McDougall found that ether, chloro- 
form and alcohol produced a marked slowing of the rate of 
alternation, whereas morphine, strychnine, tea and coffee pro- 
duced a hastening of the rate of alternation. He further found 
that the introvert subjects experienced rapid alternations 
while the extroverts showed a slow rate of alternation. He 
states, however, ‘‘The number of my subjects was far too small; 
but the indication was that the experiment reveals the position 
of the subject in the intro-extrovert scale.’’ This work was 
done in the years 1912 to 1914 and further research has not 
borne out McDougall’s conclusions. J. P. Guilford in a recent 
review on introversion-extroversion (2) states, ‘‘According to 
McDougall a slow rate of fluctuation indicates E (extrover- 
sion). Washburn was able to find only the slightest agreement 
of this kind. Braly and later Hunt, using three questionnaire 
tests, found only one significant correlation, and that was .44, 
between fluctuation rate and Laird’s C-2. With psychotic 
subjects rather significant differences appeared. The average 
rate of fluctuation for schizophrenics was from 4 to 6 times 
that for manic-depressives. The schizophrenics’ was almost 
identical with that for non-pathological subjects however, so 
that the deviation from the normal fluctuation rate is found 
entirely within the manic-depressive group. This fact among 
others throws doubt upon the supposition that these two psy- 
chotic groups are clear examples of I and E types, or else upon 
rate of fluctuation as an indicator of I-E.’’ (The italics are 
the writer’s.) In a later article Guilford (3) gives evidence 
to show that McDougall has been measuring only one aspect 
of introversion-extroversion by his ‘‘windmill’’ technique, 
namely, impulsiveness. Because of the admittedly small num- 
ber of cases studied by McDougall and because of the doubtful 
validity of his measure of introversion-extroversion, the writer 
is not astonished to find that McDougall’s hypothesis concern- 
ing the alkaloid drugs in their relation to introversion is not 
borne out by further investigation. 

Both subjective and objective anthropometric measurements 
as presented in the above two tables would seem to indicate that 
constitutionally the temperament of the addicts tends very 
slightly toward extroversion, although until many more mea- 
sures are taken and the problem studied from different angles, 
the writer would hesitate to make this statement as a final con- 
clusion. Because of the socio-economic stress to which most 
addicts are subjected, it is quite possible and even probable 
that any extrovertive tendencies would be markedly inhibited. 
For this reason one would expect to find in studies making use 
of rating scales and personal inventories a tendency to swing 
in the opposite direction, that is, toward introversion. The 
social condemnation levelled against users of morphine neces- 
sarily brings on an introverted perspective. A study bearing 
directly on this point was made by D. P. Wilson, U. S. Public 
Health Service (9) who gave, among other tests, the Neymann- 
Kohlstedt Diagnostic Test for Introversion-Extroversion, and 
the Bernreuter Personality Inventory to 216 morphine addicts 
with a retest after an interval of from three to seven months. 
In Table III is presented the Neymann-Kohlstedt results on 
the original and retest group as presented by Wilson in the 
Appendix of his unpublished report. Keeping in mind the 


*TABLE III 
Frequency Tables and Percentile—Original and Retest 
Kohlstedt 

SCORES        ORIGINAL 
22-up             3 
20-21             2 
18-19             1 
16-17             3 
14-15             3 
12-13             8 
10-11            10 
8-9               7 

[Remaining rows and the retest column are illegible in the source.] 

* (Taken from Appendix A, Table No. 1, Page 52). (9) 

fact that plus or minus scores of from 0 to 10 indicate ambi- 
version, it may be seen from this table that the average subject 
on the original test is in the ambivert group, but tends slightly 
toward introversion. On the retest the subjects show an even 
weaker tendency toward introversion, as attested by the drop 
of the mean one interval toward zero. 

Turning to the Bernreuter Personality Schedule, we see in 
Table IV that the average morphine user in Wilson’s group 
falls into the 46th percentile on the original group test and in 
the 40th percentile on the retest: 


*TABLE IV 
Bernreuter Frequency Tables 
(Original and Retest group) 

[Table body illegible in the source. Column headings: SCORE RANGE, ORIGINAL, PER CENT, RETEST.] 

* (Taken from Appendix A, Page 55). (9) 


Bernreuter calls his B3-I ‘‘A measure of introversion- 
extroversion. Persons scoring high on this scale tend to be 
introverted; that is, they are imaginative and tend to live 
within themselves. Scores above the 98 percentile bear the 
same significance as do similar scores on the B1-N scale. 
Those scoring low are extroverted; that is, they rarely worry, 
seldom suffer emotional upsets, and rarely substitute day- 
dreaming for action.’’ Wilson’s results would seem to indi- 
cate that either the morphine user at this penitentiary is rela- 
tively stable emotionally or that the Bernreuter personality 
schedule is not a very discriminative tool. Clinical observa- 
tions seem to point to the latter conclusion, although definite 
statements must await further research. 

Wilson (9) does not accept Bernreuter’s data but states in 
his report, ‘‘In the experience of the author there has been 
considerable doubt as to whether the Bernreuter B3-I mea- 
sured introversion in any regard. Personal records of hun- 
dreds of cases, both men and women, seem to indicate that the 
weights are faulty, or else our concepts of neurotic tendencies 
are faulty.’’ 

The use of personality schedules and rating blanks in our 
attempts to understand the motivating factors concerned in 
drug addiction is quite limited and the use of such scales may 
lead to erroneous conclusions. In the first place, only a 
selected group are suitable for subjects, inasmuch as good 
intelligence is a prerequisite. The 216 men selected by Wilson 
for his personality studies came from the highest class of in- 
mates in the institution. Naturally, therefore, one must be 
very wary of accepting general conclusions on the basis of 
studies conducted on a highly selected group. In addition to 
this deficiency to be found in the personality schedule, there 
is also the well-recognized tendency to answer in the most 
socially acceptable manner, many of the subjects attacking the 
test as though it were an ethical discrimination test rather than 
a personality inventory. Under suitable conditions and for 
certain purposes the personality schedules have their uses, but 
much progress awaits the development of more objective means 
of personality discrimination, such as is indicated by anthro- 
pometric technique and by the work of Darrow (1), who is 
attempting to differentiate neurotic types of behavior by means 
of securing an integrated picture of changes in the autonomic 
nervous system due to particular types of stimulation. 


CONCLUSIONS 


1. McDougall’s hypothesis concerning the close affinity be- 
tween introversion and the alkaloid drugs is not borne out by 
anthropometric studies reported. 

2. The average morphine addict coming within the purview 
of observation falls into the pyknoid group. 

3. The Bernreuter Personality Schedule and the Neymann- 
Kohlstedt Introversion-Extroversion Schedule are of limited 
scope in connection with studies on drug addiction. 


REFERENCES 


1. DARROW, C. W., AND HEATH, L. I. Reaction Tendencies Relating to 
Personality, published in Studies of the Dynamics of Behavior, 
by Lashley, Stone, Darrow, Landis and Heath, pp. 59-261. 

2. GUILFORD, J. P. Introversion-Extroversion. Psychological Bulletin, 
Vol. 31, No. 5, 1934, pp. 331-349. 

3. GUILFORD, J. P., AND GUILFORD, RUTH B. An Analysis of the Factors 
in a Typical Test of Introversion-Extroversion. Journal of 
Abnormal and Social Psychology, 1934, XXVIII, No. 4, pp. 337- 
339. 

4. KRETSCHMER, E. Physique and Character. 

5. MCDOUGALL, WILLIAM. Outline of Abnormal Psychology, pp. 441- 
449. 

6. MCDOUGALL, WILLIAM. A Chemical Theory of Temperament Applied 
to Introversion and Extroversion. Journal of Abnormal and 
Social Psychology, 1929, 24, pp. 293-309. 

7. MINER, J. B. The Psycho-Medical Correction of the Drug Habit. 
Journal of Abnormal and Social Psychology, XXVIII, 2, pp. 
119-122. 

8. WERTHEIMER, F. I., AND HESKETH, FLORENCE E. A Minimum Scheme 
for the Study of the Morphologic Constitution in Psychiatry. 
Archives of Neurology and Psychiatry, 1927, Vol. XVII, pp. 93- 
98. 

9. WILSON, D. P. A Twenty Months Psychological Study of Fifteen 
Hundred Offenders of the Harrison Narcotic Laws. United 
States Public Health Service (unpublished report). 





ATTITUDE MEASUREMENT AND THE COM- 
PARISON OF GENERATIONS 


CLIFFORD KIRKPATRICK AND SARAH STONE 
University of Minnesota 


METHODS OF ATTITUDE TESTING 


THE concept of attitude has for many years played an 
important part in sociological theory and in recent years 
there has been a distinct trend toward a quantitative 
study of social attitudes. There is justification for examining 
both critically and constructively some of the assumptions and 
methods that are involved in the attitude testing movements.¹ 
Various classifications of attitude tests might be made de- 
pending upon which particular points of distinction receive 
especial stress.² There is justification for drawing a distinction 
between questionnaires of the older type and the more modern 
attitude scales. A classification might be worked out as 
follows. 


I. QUESTIONNAIRES might be regarded as instruments which 
elicit informational or acceptance-rejection responses from the 
subjects and imply no general attitude continuum covered 
by the instrument as a whole in terms of amount or degree. 
The instrument may call for (A) acceptance responses indi- 
cated by yes, circles or check marks, (B) rejection responses 
indicated by no, crossing out of items or x signs, (C) some 
combination of acceptance and rejection responses, (D) 
descriptive essay responses, (E) informational responses. 

II. SCALES in contrast to questionnaires as defined above 
call for subject responses ultimately expressible in terms of 
amount or degree. 

(A) Rating Scales are the simplest form of attitude scale. 

1 See Bain, Read, Theory and Measurement of Attitudes and Opinions, 
Psychological Bulletin, Vol. 27, No. 5, May, 1930, pp. 357-379. 


2 Droba, D. D., Methods Used for Measuring Public Opinion, Am. Jr. 
Soc., Vol. 37, 1931, pp. 410-423. 

They may (1) take the form of an adjective scale or (2) a 
numerical scale. The former would be illustrated by a request 
for the subject to indicate his attitude toward religion as 
favorable, neutral or hostile. The numerical scale would be 
illustrated by a request that the subject indicate his attitude 
toward religion by choosing a point somewhere on a scale 
of 1-5. 

(B) Un-Standardized Propositional Scales are similar to 
adjective scales but differ in that degrees of attitude variation 
are indicated by phrases or sentences rather than mere adjec- 
tives. The constructor of such a scale might decide arbitrarily 
that the proposition, ‘‘Religion is the noblest expression of 
the human spirit,’’ should receive a scale value of 5 while the 
proposition, ‘‘Religion is the epitome of ignorance, stupidity 
and hypocrisy,’’ should receive a scale value of 1. Milder 
propositions might receive intermediate scale values.³ 

(C) Standardized Propositional Scales differ from (B) in 
that the scale value which a proposition receives is decided 
by sigma weights or the consensus opinion of judges who assist 
in preparing the scale. The judges may indicate their esti- 
mates of the proper numerical scale values of the propositions 
in three ways. (1) They may rate each proposition as to its 
favorableness or unfavorableness on a numerical scale. For 
example the proposition, ‘‘ Religion is all right in its place,’’ 
might receive on the average a rating of 3 by the judges on 
a five point scale. (2) A number of propositions might be 
submitted to judges to be ranked by each judge in the order 
of their favorableness. The average numerical ranking which 
a proposition receives constitutes its scale value. (3) A more 
laborious method of standardization is the method of paired 
comparisons in which each proposition is paired with every 
other proposition as to favorableness toward the issue under 
consideration. The rankings thus derived from the various 
judges may be averaged to give a numerical scale value to 
each proposition. (4) A refinement of the rating method 

3 Likert demonstrates a pragmatic justification for such methods. See 
Likert, Rensis, A Technique for the Measurement of Attitudes, Archives 
of Psychology, No. 140, 1932. 

mentioned above (C, 1) has been worked out by L. L. Thurstone 
and his associates which is known as the method of equal- 
appearing-intervals. This method will be analyzed in the next 
section of this paper. 
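The first three procedures for deriving scale values from judges can be sketched briefly. This is only an illustration under stated assumptions: the function names, the three propositions A, B and C, and the two judges' responses are hypothetical, and the rank assigned under paired comparisons (one plus the number of comparisons won) is one simple convention among several.

```python
from itertools import combinations
from statistics import mean

def scale_by_rating(ratings):
    """(1) Rating: scale value = mean of the judges' numerical ratings."""
    return {prop: mean(values) for prop, values in ratings.items()}

def scale_by_ranking(rankings):
    """(2) Ranking: each judge orders the propositions from least to most
    favorable; scale value = mean rank position across judges."""
    props = rankings[0]
    return {prop: mean(order.index(prop) + 1 for order in rankings)
            for prop in props}

def scale_by_paired_comparison(props, judgments):
    """(3) Paired comparisons: each judge names the more favorable member
    of every pair; a judge's rank for a proposition is one plus the number
    of comparisons it wins, and the ranks are averaged across judges."""
    ranks = {prop: [] for prop in props}
    for choice in judgments:                      # one dict per judge
        wins = {prop: 0 for prop in props}
        for pair in combinations(props, 2):
            wins[choice[pair]] += 1
        for prop in props:
            ranks[prop].append(wins[prop] + 1)
    return {prop: mean(values) for prop, values in ranks.items()}

# Hypothetical judgments from two judges on three propositions.
props = ["A", "B", "C"]
ratings = {"A": [1, 2], "B": [3, 3], "C": [5, 4]}
rankings = [["A", "B", "C"], ["B", "A", "C"]]
judgments = [
    {("A", "B"): "B", ("A", "C"): "C", ("B", "C"): "C"},
    {("A", "B"): "A", ("A", "C"): "C", ("B", "C"): "C"},
]

print(scale_by_rating(ratings))
print(scale_by_ranking(rankings))
print(scale_by_paired_comparison(props, judgments))
```

With n propositions each judge makes n(n - 1)/2 comparisons, which is why the text calls the third method the more laborious one.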

(D) The Belief Pattern Scale Method has been worked out 
by the writers on the basis of different assumptions and 
attempts to combine the advantages of a single numerical 
score with the possibility of a configurational analysis. This 
method will be described in greater detail in the latter part 
of this paper. 


THE THURSTONE METHOD 


The Thurstone Method of equal-appearing-intervals has 
been presented as a scientific method for the actual measure- 
ment of attitudes in essentially the same sense as measure- 
ment is carried on in the natural sciences.⁴ It must be granted 
that Professor Thurstone and his associates have made a bril- 
liant contribution to psychological and sociological method- 
ology and have devised perhaps the most widely used method 
of attitude scale construction. The writers feel, however, that 
there are assumptions made by Thurstone, especially in his 
later work, which require justification and that there are alter- 
native possibilities which need further consideration. Certain 
questions can be raised which seem fundamental in the whole 
problem of attitude measurement. 

(1) Is there not an important distinction between measure- 
ment as the term is used by Thurstone and measurement given 
a somewhat more precise definition? It would seem that a 
legitimate and useful distinction can be drawn between attri- 
butes, qualitative variables and quantitative variables. Attri- 
butes may be defined as single entities which are either present 
or absent and not usefully conceived of in terms of amount or 
degree. A person is married or is not. Death did or did not 
take place. Such attributes of course may be counted from 
the point of view of frequency of incidence but they are not 


4 Thurstone, L. L., Attitudes Can Be Measured, Am. Jr. Soc., Vol. 33, 
No. 4, Jan., 1928, pp. 529-554. 

in themselves variables. Qualitative variables may be defined 
as expressions of ordinal relationship in terms of more or less. 
Quantitative variables are multiples or sub-multiples of units 
which are conceptually equal and interchangeable without 
effect on the relevant implications of the derived numerical 
expression. It may be argued that the term measurement 
could usefully be restricted to the derivation of quantitative 
variables by the process of counting units which are equal and 
interchangeable with reference to the purpose at hand and 
which are added to yield multiples of such units. Measuring 
the population of a room to determine the number of chairs 
required is measurement in the above sense. The people in 
the room differ but they are essentially interchangeable from 
the point of view of the seating problem. A body weight of 
150 pounds is essentially a multiple expressing the number of 
pound weights that would balance a particular human body. 

It should be noted that measurement in the sense of the 
derivation of a quantitative variable may be direct or indirect. 
The person of 150 pounds weight may push a spring scale to a 
point marked 150 pounds. Here the movement of a needle on 
a dial is an index of weight. Counting of people is measuring 
population directly by deriving multiples of units which are 
themselves parts of the thing measured. A death rate on the 
other hand, to cite one more example, may be an index of the 
health conditions in a particular locality. 

The unit counted whether used in direct or indirect measure- 
ment may be either natural or artificial. A person is a 
natural unit in measuring population. An industrial acci- 
dent, on the other hand, may be artificially defined as an 
injury necessitating the loss of at least a day’s work. A ton- 
mile is more definitely artificial. A degree, a calorie or an erg 
are thoroughly artificial units of temperature, heat and energy. 
Here effects produced constitute units in the indices of the 
‘‘thing in itself.’’ 

In the third place it may be noted that in the process of 


5 Sub-multiples of one unit would of course be multiples of a smaller 
unit in a defined ratio-equivalent to the larger unit. 

measurement as here defined scales may or may not be used. 
If used, a scale may be employed as (1) a conceptual instru- 
ment of classification or (2) a physical instrument of observa- 
tion. In the former sense a scale is a continuum used in the 
classification of observations. Men may be grouped in a fre- 
quency curve with reference to a scale or continuum of height. 
A scale as an instrument of observation is apparently regarded 
by Chapin⁶ as the sine qua non of measurement in contrast to 
counting or enumeration. It may be urged, however, that this 
distinction, while perhaps useful, is not fundamental and is 
actually likely to blur the basic distinction between quantita- 
tive and qualitative variables. The use of a scale as an instru- 
ment of observation in what we have called indirect measure- 
ment through indices is simply a means of counting units by 
groups rather than singly. On a tape measure, a thermometer 
or a speedometer, numbers are used to facilitate the deriva- 
tion of unit multiples. An inch or foot could be turned end 
over end to measure distance or intervals on a thermometer 
could be counted separately but attached numbers which desig- 
nate unit groups further the process of counting. The count- 
ing of the index units may be by recording as in the case of 
a clinical thermometer or by matching as in measuring human 
height with a tape measure. Professor Chapin has usefully 
distinguished a variety of measurement but it would seem that 
his criterion of an external scale of reference separates the 
similar and identifies the dissimilar, as for example, microm- 
eter readings and estimates of social distance. 

The derivation of qualitative variables, it may be contended, 
is simply ordinal classification, in terms of more or less of 
qualities that are not expressed in interchangeable units. Per- 
son A is more beautiful than person B or more hostile to pro- 
hibition. A certain operation B may be described as hurting 
less than operation A but more than operation C. Scholastic 
work is rated medium rather than excellent or poor. An 

6 Chapin, F. S., The Meaning of Measurement in Sociology, Publication 
of the American Sociological Society, Vol. XXIV, No. 2, May, 1930, pp. 
89-91. 

article of furniture might be rated as intermediate between a 
chair and a table. In all of these instances ordinal variation 
and ordinal relationships are implied by numbers or by adjec- 
tives. Qualitative rather than quantitative variables are de- 
rived under two conditions. (1) Psychic conditions and proc- 
esses are numerically described by qualitative variables when 
multiples of external and interchangeable units are not de- 
rived as indices by an objective counting process. When the 
pain of a dental extraction is expressed as a qualitative vari- 
able in a rating of 4 on a scale of 5, it is probable that no 
psychic unit of pain is involved as an aid to the rating process. 
Even if imaginary psychic units were counted by the subject 
interchangeability of such units would be questionable in the 
absence of observable external counterparts of the psychic 
units. A gram is a conceptual unit of measurement but the ex- 
istence of external physical counterparts in the form of gram 
weights insures the comparability of grams as conceived by 
different persons. Chapin has notable insight into the nature 
of measurement but in the opinion of the writers the criterion 
of an external and interchangeable unit could well replace the 
criterion of an external scale of reference. On the basis of 
the present analysis a rating of degree of pain associated with 
tooth extraction is a qualitative variable. The number of 
screams of a patient is a quantitative variable which is a rough 
index of pain. Screams can be added to yield a multiple and 
are relatively interchangeable in view of their existence as 
stimuli to the senses of more than one observer. 

In the second place qualitative variables come into existence 
when a comparison is made between unanalyzed configura- 
tions. Persons may be rated or classified as to degrees of 
physical maturity without analysis with reference to quantita- 
tive variables such as age, height, weight, and the like. While 
classification and counting are never separate, culturally con- 
sidered, expressions of degree tend to precede expressions of 
amount. In one sense the progress of science consists in 
translating qualitative variables into quantitative variables 
although for purposes of condensation the process may be 
reversed. 

It may be contended that in the case of quantitative vari- 
ables there must be counting of units on the part of an ob- 
server and interchangeability of his units with those of other 
observers. The writers would define measurement as the 
derivation of quantitative variables.⁷ 

(2) In view of these considerations has not Professor Thur- 
stone essentially indulged in numerical classification in terms 
of more and less rather than in measurement as here defined? 
There is reason to think that a failure to recognize these dis- 
tinctions may lead to questionable inferences. In the Thur- 
stone method a proposition in regard to the Church is placed 
by judges in a pile which has a certain ordinal relationship to 
other piles of propositions. This pile has a numerical name 
and other piles in which a proposition may be placed are also 
given numerical names indicating ordinal relationships as to 
favorableness toward the Church. The proposition in turn 
receives a numerical name (scale value) which indicates a con- 
sensus of pile placement and an ordinal relationship to other 
propositions subjected to a different consensus as to pile place- 
ment. The proposition, ‘‘I do not receive any benefit from 
attending church services but I think that it helps some 
people,’’ has a numerical name (scale value) of 5.7.⁸ The 
number 5.7 is not a multiple of any real unit. It means little 
more than that the proposition is regarded in general as more 
or less hostile toward the Church than certain other proposi- 
tions with different numerical names. The same ordinal rela- 
tionship could have been indicated by adjectives rather than 
numbers. A person who takes a Thurstone test accepts cer- 
tain propositions and thus receives a numerical name for his 
attitude or degree of attitude. The numerical name is not a 
product of counting and could also be replaced by adjectives. 

7 Kirkpatrick, C., Statistical Studies of Personality and Personality Mal- 
adjustment, Statistics in Social Studies, Stuart Rice, Editor, University of 
Pennsylvania Press, 1930, pp. 197-216. 


8 Thurstone, L. L., and Chave, E. J., The Measurement of Attitude, 
University of Chicago Press, 1929, p. 33, also p. 61. 

It means simply that a person from his choices inferred to be 
hostile (numerical name 10) is more hostile than if he were 
not altogether hostile (numerical name 9) and less hostile than 
if he were absolutely hostile (numerical name say 11). 
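The two steps just described, a numerical name for each proposition and then one for each person, may be sketched as follows. The text does not say how the consensus of pile placement or the person's score is computed; the median pile and the mean of accepted scale values used here are assumptions made for illustration, and the propositions, placements and function names are hypothetical.

```python
from statistics import mean, median

def scale_value(pile_placements):
    """Numerical name of a proposition: the consensus of the judges' pile
    placements, taken here (as an assumption) to be the median pile number."""
    return median(pile_placements)

def respondent_score(scale_values, accepted):
    """Numerical name of a respondent: taken here (as an assumption) to be
    the mean scale value of the propositions he accepts."""
    return mean(scale_values[prop] for prop in accepted)

# Hypothetical pile placements by five judges, piles numbered 1 to 11.
placements = {
    "P1": [5, 6, 6, 5, 7],
    "P2": [9, 10, 10, 11, 10],
    "P3": [2, 1, 2, 3, 2],
}
scale_values = {prop: scale_value(piles) for prop, piles in placements.items()}

# A respondent who accepts propositions P1 and P2.
score = respondent_score(scale_values, accepted=["P1", "P2"])
print(scale_values, score)
```

Nothing in either step counts units; both merely combine pile numbers, which is the point at issue in the surrounding argument.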

If the process here described is merely ordinal classification 
with the aid of numerical names, then the comparison of scale 
intervals and the averaging or other manipulation of ordinal 
numbers as though they were cardinal numbers, may lead to 
danger of false or misleading inferences. Taking the median of 
scale value six and scale value four perhaps has no more meaning 
than averaging the adjectives ‘‘fair’’ and ‘‘excellent.’’ 
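The danger can be made concrete with a small sketch. If scale scores are only ordinal names, any renaming that preserves their order is equally legitimate, yet such a renaming can reverse a comparison of group averages. The two groups and the renaming below are hypothetical.

```python
from statistics import mean

# Hypothetical ordinal scores for two groups on a five-step scale.
group_a = [1, 1, 5]
group_b = [2, 2, 2]

# An order-preserving renaming of the same five steps: every step keeps
# its place relative to the others, only the numerical names change.
relabel = {1: 1, 2: 4, 3: 4.5, 4: 4.8, 5: 5}

def relabelled(scores):
    """Apply the order-preserving renaming to a list of scores."""
    return [relabel[score] for score in scores]

a_before, b_before = mean(group_a), mean(group_b)
a_after, b_after = mean(relabelled(group_a)), mean(relabelled(group_b))

print("original names: A higher than B?", a_before > b_before)
print("renamed steps:  A higher than B?", a_after > b_after)
```

With equal and interchangeable units no such reversal could occur, since a change of unit would alter both averages in the same proportion.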

(3) Is a Thurstone scale score expressed in terms of equal 
and interchangeable units? The above considerations carry 
the implication that a Thurstone scale score is not a quantita- 
tive variable. There seems no reason to think that the inter- 
val between three and four constitutes a unit which is con- 
ceptually interchangeable with the interval between nine and 
ten even for a particular judge. The interval of the so-called 
continuum between proposition A and proposition B is not 
necessarily equal-appearing to all of the judges. The equal- 
appearing interval as a psychic unit, if unit it may be called, 
is not ordinarily counted in the rating process. Most judges 
assisting in the preparation of an attitude scale by the Thur- 
stone method would probably give an introspective report of 
comparison rather than of counting. Even if the intervals of 
a favorableness continuum were counted by a rater in a way 
analogous to estimating length by imagining a ruler turned 
end over end, the intervals in the minds of different persons 
could not be regarded as interchangeable in view of the lack 
of any physical counterpart of the psychic unit. A person 
might count throbs of pain and thus approach a quantitative 
variable but inter-person interchangeability would be lacking. 
Certainly a proposition might be so favored as to yield a hun- 
dred acceptances by voters. The number of persons accept- 
ing would be a quantitative variable and an index of the 
popularity of the proposition. The distinction stressed here 
seems to be more than one of degree. 
Insofar as the judges give the same rating to a particular
proposition they ignore configurational differences in the sense 
that according to the rules of the game laid down for them they 
are not permitted to recognize these configurational differences. 
Obviously, since scale value six is not a multiple of any abso- 
lute unit susceptible to counting, the scale value would have a 
totally different meaning if the judges were instructed to 
make seven or seventeen piles of propositions rather than 
eleven piles. There seems a real difference between a scale in- 
terval and the just-noticeable difference experienced by the in- 
dividual in psycho-physical experiments. In this latter case 
there seems to be greater interchangeability of units.⁹

Again, there seems danger of false and misleading infer- 
ences. Suppose that a considerable number of men were 
rated as to height by judges, purely with reference to the 
ordinal relationship of more or less, and placed along eleven 
platforms, arranged in a height continuum. Is there any 
reason to think that the difference in average height between 
the men on platform one and those on platform two would 
correspond to the average difference in height between the 
men on platform six and those on platform seven? It would
seem, as a matter of fact in view of a roughly normal distri- 
bution of height (expressed in inches), that the former differ- 
ence in inch-height would be much greater. This would tend 
to be true of either heights or inherent propositional strengths 
in proportion as the judges tended to assign an equal propor- 
tion of the items to the various platforms or piles. An infinite 
number of intervals of course would tend to make such inter- 
vals approach the status of equivalent units. A rating scale 
with a billion intervals would be made up of intervals con- 
ceptually interchangeable. But few rating scales are divided 
into more than eleven intervals, and hence the configurational 

9 The method of paired comparisons used in Thurstone’s earlier work
yielding a mental unit in terms of sigma values of estimate probabilities
seems to have more theoretical validity although perhaps less pragmatic
justification. See Thurstone, L. L., An Experimental Study of Nationality
Preferences, Jr. Gen. Psych., Vol. I, No. 3 and 4, July-Oct., 1928,
pp. 405-423.
differences between intervals are not whisked away by ‘‘the
ghost of vanished quantities.’’ 
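
The platform illustration can be worked numerically. The sketch below assumes an exactly normal distribution of height, eleven platforms each receiving an equal proportion of the men, and a mean of 68 inches with a standard deviation of 2.5 inches; these figures are assumptions made for illustration and are not data from the study.

```python
from statistics import NormalDist

def slice_means(n_slices, mu=0.0, sigma=1.0):
    """Mean of each of n equal-proportion slices of a normal distribution."""
    std = NormalDist()  # standard normal
    # Interior cut points in z units; the density at the infinite outer edges is 0.
    cuts = [std.inv_cdf(k / n_slices) for k in range(1, n_slices)]
    dens = [0.0] + [std.pdf(z) for z in cuts] + [0.0]
    p = 1.0 / n_slices
    # Mean of a normal truncated to (a, b): mu + sigma * (pdf(a) - pdf(b)) / P(a < Z < b)
    return [mu + sigma * (dens[k] - dens[k + 1]) / p for k in range(n_slices)]

# Assumed height distribution in inches (illustrative only).
means = slice_means(11, mu=68.0, sigma=2.5)
gap_1_2 = means[1] - means[0]   # platform two minus platform one
gap_6_7 = means[6] - means[5]   # platform seven minus platform six
print(round(gap_1_2, 2), round(gap_6_7, 2))
```

On these assumptions the gap in inches between the first two platforms is several times the gap between the sixth and seventh, which is the inequality of intervals that the argument anticipates.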

(4) Is Thurstone justified in assuming an attitude con- 
tinuum? It would seem a possibility that the fundamental 
assumptions of an attitude continuum are based on an analogy 
rather than on an identity with the familiar continua which 
are quantitative variables. We can conceive of a continuum 
ranging from zero pounds to an infinite number of pounds but 
is there a continuum of furniture configurations ranging from
a chair to a table? Chairs may gradually approach tables in 
type, but there is no simple addition of either natural or arti- 
ficial units. To say that an article of furniture has a value 
of six on the chair-table scale has no very precise meaning. 
It is quite possible likewise, that it is a configuration of atti- 
tudes known as hostility toward the church which varies to- 
ward quite a different configuration known as favorableness 
toward the church. A degree-continuum rather than an 
amount-continuum can perhaps be assumed but certain diffi- 
culties still remain. It is legitimate to abstract favorableness 
from propositions just as it is legitimate to abstract height 
from furniture but a favorableness continuum describes differ- 
ences in attitudes little better than a height continuum de- 
scribes differences in furniture. It would seem that there is 
far more to a person’s attitude pattern than a degree of favor- 
ableness. A degree-continuum can be assumed and abstracted 
but its limitations should be recognized. 

(5) Is not a score on a Thurstone scale unnecessarily
ambiguous? It does seem that a simple test score ignores atti- 
tudinal configurations. As a corollary there arises the possi- 
bility that two persons might obtain the same score through 
different motives. It is conceivable that a religious objector 
might obtain the same pacifism score as an atheistic socialist 
who is disturbed by economic waste. Many of the proposi- 
tions in scales prepared by the Thurstone method do not dis- 
tinguish between the emotional and the intellectual compon- 
ents of the subject’s reaction. The subject is often forced to 
accept or reject a rather confusing mixture of valuational and 
factual relationships. 
PROPOSITIONAL ANALYSIS OF AN ATTITUDE SCALE


The problem of balancing or disentangling evaluation and 
factual relationships in proposition items justifies some further 
illustration. Form A of Scale no. 21, known as ‘‘Attitude
Toward Birth Control’’ prepared by Charles A. Wang and
L. L. Thurstone, may be analyzed with reference to this point.
Every proposition or statement sets forth an implicational
relationship. Let F stand for a factual entity; V for an evalu-
ation; VF for an evaluation fact; and AF for an assumed fact.
Each statement then of the Birth Control scale may be roughly 
expressed by a formula. In the analysis-formulae given below 
the dash implies an implication relationship between the two 
parts of the formula. 


1. Birth control (F) is a legitimate (A) health measure (F). F-AF. 

2. Birth control (F) is necessary (A) for women who must help earn 
a living (F). F-AF. 

3. The practice of birth control (F) may be injurious physically, 
mentally and morally (F). F-F or F-AF. 

4. We (F) simply must have birth control (VF). F-VF. (We 
implies value placed upon birth control.) 

5. The practice of birth control (F) is equivalent to murder (VF). 
F-VF. (Birth control implies an act valued as a murder.) 

6. Birth control (F) has both advantages and disadvantages (AF). 
F-AF. 

7. Only a fool (VF) can oppose birth control (F). F-VF. (Opposi- 
tion to birth control (F) implies a foolish fellow (VF).) 

8. Birth control (F) increases the happiness of married life (F). F-F. 

9. Decency (AF) forbids the use of birth control (F). AF-F. 
(Decency (AF) implies no use of birth control (F).) 

10. Birth control (F) should be absolutely prohibited (VF). F-VF.

11. Birth control (F) is the only solution to many of our social prob- 
lems (AF). F-AF. 

13. Birth control (F) has nothing to do with morality (V). F-V. 
(Birth control implies amorality.) 

14. Birth control information (F) should be available to everybody 
(VF). F-VF. (Birth control implies desirable availability.) 

15. Birth control (F) is morally wrong, in spite of its possible bene-
fits (V). F-V. 

16. Uncontrolled reproduction (F) leads to overproduction (F), social 
unrest (F) and war (F). F-FFF. 

17. Birth control (F) is race suicide (AF). F-AF. 







18. People (F) should be free to do whatever they wish about birth 
control (VF). F-VF. 

19. The practice of birth control (F) evades man’s duty to propagate 
the race (AF). F-AF. (Birth control implies evasion of propagation 
duty.) 

20. The slight benefits of birth control (AF) hardly justify it (AF).
AF-AF. (The slight benefits of birth control (AF) imply little justi-
fication (AF).)


The variation between these condensed implication patterns 
shows the heterogeneity of the propositions. It also shows that 
the acceptance or rejection of a particular proposition is am- 
biguous since either factual, evaluational or logical considera- 
tions may have motivated the acceptance or rejection of the 
statement. 
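
The heterogeneity can be tallied directly from the formulae listed above. The sketch below simply counts how often each pattern occurs; the dictionary transcribes the formulae as given (no statement 12 appears in the list, and the alternative offered for statement 3 is kept as a single entry).

```python
from collections import Counter

# Formula assigned to each statement in the analysis above.
formulas = {
    1: "F-AF", 2: "F-AF", 3: "F-F or F-AF", 4: "F-VF", 5: "F-VF",
    6: "F-AF", 7: "F-VF", 8: "F-F", 9: "AF-F", 10: "F-VF",
    11: "F-AF", 13: "F-V", 14: "F-VF", 15: "F-V", 16: "F-FFF",
    17: "F-AF", 18: "F-VF", 19: "F-AF", 20: "AF-AF",
}

counts = Counter(formulas.values())
for pattern, n in counts.most_common():
    print(f"{pattern:12s} {n}")
print("distinct patterns:", len(counts))
```

Nineteen statements fall into eight distinct patterns, with F-AF and F-VF each accounting for six.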


THE BELIEF PATTERN METHOD OF SCALE CONSTRUCTION 


These comments on the Thurstone method of test construc- 
tion have not been made with the assumption that there is a 
single best method of test construction. It is possible that 
there is no perfect method and no perfect instrument. It may 
be that certain methods and instruments are merely better 
than others for certain purposes. It is possible that in certain 
instances the interest of the investigator is primarily in the 
ideo-verbal aspects of attitudinal behavior while the emotional 
intensity of belief is of less importance. There are also prob- 
lems in which the attitudinal behavior to be studied has a con- 
figurational aspect. There certainly are configurated attitudes, 
as those characteristic of a good Catholic, which are the indi- 
vidual counterparts of a culture complex. The ‘‘better’’ the 
Catholic the larger the proportion of the traits of the Catholic 
culture complex which is accepted. The marginal Catholic 
tends to both accept and reject arguments and dogmas. He 
may reject tenets of his Church and also approach an agnostic 
or Protestant type by acceptance of elements of a different cul- 
ture complex. 

The method of test construction used by the writers and 
designed to yield what might be called a Belief Pattern Scale 
may have some advantages with respect to this type of problem. 
Furthermore an attempt has been made to overcome certain of
the disadvantages in the Thurstone method and to seek the 
following objectives. (1) Scores which are the result of mea- 
surement in that they are quantitative variables expressed as 
multiples of units which are essentially equal and conceptually 
interchangeable with reference to the purpose in mind. (2) 
Scores which are relatively unambiguous in that they have a 
common sense meaning as multiples of entities which can be 
counted. (3) A method of construction which would make 
possible a configurational analysis by reaction categories which 
would give insight into the motives which determine the selec- 
tion of particular propositions as well as the single numerical 
score which over-condenses the attitude picture. (4) More 
complete sampling of the universe of ideo-verbal behavior char- 
acteristic of the attitude pattern. A scale with only twenty 
propositions might yield a very high reliability and internal 
consistency by virtue of the fact that many aspects of the sub- 
ject’s attitude pattern are not taken into account. (5) A re- 
cording of inconsistencies inherent in the attitudinal pattern 
which reflect culture conflict and marginal culture status. 

A tentative Belief Pattern Scale was devised to analyze and 
measure the religious attitudes of educated groups. Proposi- 
tions were formulated which were characteristic of both re- 
ligious and irreligious patterns of thought. Each proposition 
contained the essence of an argument belonging to some cate- 
gory of defense or attack on religion. These propositions, 134 
in number, were submitted to 11 judges for classification. Each 
judge was presented with the propositions typed on cards and 
with a chart arranged to indicate various categories of classi- 
fication. The judge first placed each card to indicate a classi- 
fication of the argument as for or against religion. He then 
classified the pro and the con arguments as based on social, per-
sonal, cosmological or epistemological grounds. Thus each
argument or proposition is placed by each judge in one of eight 
possible categories. Since the intent was to obtain arguments 
which would be conceptually equal and interchangeable to 
serve as units it was necessary to eliminate those arguments 
which were so extreme in wording that they could not be re-
garded as comparable elements in an argument configuration.
The judges, therefore, were asked to classify each of the propo-
sitions in the eight piles as ‘‘mild,’’ ‘‘normal’’ or ‘‘strong’’
with reference to vigor of wording.

The propositions were accepted for the final version of the 
test when (1) all of the judges made the same classification 
as to favorableness of the proposition, (2) at least 75 per cent 
of the judges agreed as to argument category placement and 
(3) at least 75 per cent of the judges gave the same rating as to 
strength of wording. On the basis of these criteria a test form 
was prepared consisting of 70 propositions of which 35 were 
pro and 35 con. There were 14 social arguments, 6 personal, 
12 cosmological and 38 epistemological arguments equally di-
vided in each instance between the pro and the con argument 
groups. A personal-social data sheet was attached to each test 
blank. 
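
The three acceptance criteria can be stated as a short procedure. The sketch below is an illustration only: the function names and the sample ratings are hypothetical, and it encodes nothing beyond the three criteria as worded above.

```python
from collections import Counter

def modal_share(labels):
    """Proportion of judges who gave the most common label."""
    return Counter(labels).most_common(1)[0][1] / len(labels)

def accept(ratings, threshold=0.75):
    """ratings: one (direction, category, strength) tuple per judge."""
    directions, categories, strengths = zip(*ratings)
    return (
        len(set(directions)) == 1                   # (1) unanimous as to pro or con
        and modal_share(categories) >= threshold    # (2) 75 per cent agree on category
        and modal_share(strengths) >= threshold     # (3) 75 per cent agree on strength
    )

# Hypothetical ratings by eleven judges.
kept = [("pro", "social", "normal")] * 9 + [("pro", "personal", "mild")] * 2
dropped = [("pro", "social", "normal")] * 8 + [("pro", "personal", "normal")] * 3
print(accept(kept), accept(dropped))
```

With eleven judges, nine agreeing ratings satisfy the 75 per cent rule while eight do not, so the second hypothetical proposition is rejected on category placement.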

The score on the Belief Pattern Scale consisted of the alge- 
braic sum of the favorable and unfavorable propositions which 
were accepted. The maximum religious score was plus 35 and 
the maximum irreligious score minus 35. The checking of an 
equal number of pro and con propositions gave a neutral score 
of zero. The assumed continuum thus ranged from −35
to +35 as expressed in terms of accepted arguments positive or
negative. The index of attitude is a derived quantitative 
variable. 
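
The scoring rule amounts to a simple count. A minimal sketch, in which the function name and the representation of a subject's checked propositions as a list of labels are assumptions made for illustration:

```python
def belief_pattern_score(accepted):
    """Algebraic sum of accepted propositions: +1 for each pro, -1 for each con.

    accepted: 'pro' or 'con' label of every proposition the subject checked.
    """
    accepted = list(accepted)
    return accepted.count("pro") - accepted.count("con")

print(belief_pattern_score(["pro"] * 35))                # all favorable propositions
print(belief_pattern_score(["con"] * 35))                # all unfavorable propositions
print(belief_pattern_score(["pro"] * 12 + ["con"] * 12)) # equal numbers checked
```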

Certain defects are fairly obvious in this tentative scale 
form. The propositions do not fall in equal proportions in the 
various argument categories and certain categories contain so 
few propositions as to be lacking in significance. The test is 
considerably longer than the average test prepared according 
to the Thurstone method. In spite of this disadvantage it was 
filled in by college students in about 20 minutes and this in- 
terval included the time spent filling in the rather extensive 
personal data sheet. It may be argued that the criteria for the 
selection of propositions were arbitrary but at least the criteria 
were formulated in objective terms. It is also true that the 
Belief Pattern Scale method stresses intellectual components
of attitudinal behavior and is perhaps especially restricted in 
its applicability to better educated subjects. Emotional inten- 
sity of belief is perhaps less well recorded than on a Thurstone 
type of test and in the original tentative form of the Belief 
Pattern Scale, valuation and factual relationships were as 
much mixed as on the Attitude Towards Birth Control Test 
previously analyzed.¹⁰ Perhaps the most serious objection to
the method of test construction here described is the fact that 
the units only approximate equality and interchangeability. 
The criterion for ruling out atypical items from the point of 
view of strength of wording is admittedly arbitrary. The pro- 
portion of propositions eliminated would vary directly with a 
refinement of the judges’ atypical ratings beyond the three 
classifications which they were permitted to make. At least in 
the method of scoring used there is a counting process and a 
very simple manipulation of multiples of units which are rela- 
tively interchangeable with reference to the purpose in mind. 
The score can be given a common sense interpretation with 
reference to actual entities, namely arguments of comparable 
wording. It is a derived quantitative variable. 

While theoretical considerations are important the value of 
an attitude test also depends on the consistency of the responses 
with similar or different kinds of responses under conditions in 
which consistency might be expected. The degree to which 
human responses evoked by a particular method are consistent 
with a similar kind of responses evoked by the same method at 
another time under conditions assumed to be the same is a 
measure of the reliability of that method and of the results 
derived from it. When responses evoked by a particular 
method and assumed to have a certain meaning are consistent 
with another kind of response assumed to have a similar mean- 
ing, the method and its results are spoken of as having validity 
rather than reliability. The distinction is purely one of degree 
depending on the similarity of the two kinds of response.¹¹

10 The Belief Pattern Scale is omitted for lack of space. 


11 Kirkpatrick, C., Report of a Research into the Attitudes and Habits 
of Radio Listeners, Webb Publishing Co., St. Paul, 1933, p. 16. 
As a check upon the reliability of the test, a group of 90 stu-
dents were given the Belief Pattern Scale for religious atti- 
tudes as a retest after an interval of two weeks. The mean 
difference between the scores on the two series of tests was 3.46. 
The greatest discrepancy between two scores for any particular 
student was 12 points on a scale of 70. Only two persons devi- 
ated in their scores by more than ten points, and six individuals 
made identical scores on both tests. The correlation method 
yielded a rho of +.94 ± .009. The 115 students who were church
members made a mean score of +5.72; the 98 non-members,
−7.35. The difference between the two means amounted to
13.07 points and the standard deviation of the difference be-
tween the two means was 1.98. Scores correlated with what
might be called a more tangible evidence of religiosity also
yielded significant differences. Students who reported not at-
tending church in the two months preceding the experiment
made an average score of −7.62; those who attended from one
to three times, −1.00; those who attended from four to six
times, 10.44; and those who attended from seven to nine times,
13.96.
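
The weight of the difference between members and non-members can be expressed as the ratio of the difference to its standard deviation. The sketch below uses the figures reported above; the function name is illustrative, and the rule of thumb that a ratio of about 3 or more marks a reliable difference is an assumption drawn from the general practice of the period, not a criterion stated in the text.

```python
def critical_ratio(mean_a, mean_b, sd_of_difference):
    """Difference between two means divided by the S.D. of that difference."""
    return (mean_a - mean_b) / sd_of_difference

members, non_members = 5.72, -7.35
difference = members - non_members
ratio = critical_ratio(members, non_members, 1.98)
print(round(difference, 2), round(ratio, 1))
```

The difference is thus more than six times its standard deviation.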

The validity of a test is supported if the segment of the re- 
ligious behavior sampled corresponds to the larger pattern of 
denominationalism. Scores should differ widely between con- 
servative and liberal groups. Religious ‘‘experts’’ including 
three Lutheran ministers, one Lutheran Theological Seminary 
student, one Methodist minister, an Executive of the Federal 
Council of Churches of Christ, the Director of the Institute 
for Social and Religious Research and the Dean of the Yale 
Divinity School agreed in ranking Catholics and Lutherans 
as more conservative than Congregationalists or Unitarians. 
On the basis of Leuba’s study and common sense considera- 
tions psychologists were selected as a group representing a 
still further degree of religious radicalism or perhaps agnos-
ticism.¹² Ministers were regarded as most representative of

12 Leuba, J. H., The Belief in God and Immortality, Open Court Publishing
Co., Chicago, 1921, pp. 219-280. See also Leuba, J. H., Religious
Beliefs of American Scientists, Harpers, Aug., 1934, pp. 291-301.
religious opinion and therefore the names of such religious
leaders were sampled from the city directories of Minneapolis, 
St. Paul, Duluth, Milwaukee, Detroit, Cleveland, Cincinnati, 
Los Angeles and Boston. Of the 219 blanks sent to Catholic 
priests and Lutheran ministers (66 to priests and 153 to min- 
isters) 42 blanks were returned or a yield of about 19 per cent. 
Of the 144 blanks sent to the Congregational and Unitarian 
ministers (112 Congregational and 32 Unitarian) there were 
returned 52 blanks or a yield of about 36 per cent. 

For the sample of psychologists the first name on every odd-
numbered page of the Psychological Register was taken.¹³ Of
the 158 schedules sent out, 77 or approximately fifty per cent 
were returned. 
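
The yields quoted for the three mailings follow from the counts given. A brief check, in which the group labels are shorthand introduced here:

```python
# (blanks sent, blanks returned) for each mailing, as reported above.
mailings = {
    "Catholic-Lutheran": (66 + 153, 42),
    "Congregational-Unitarian": (112 + 32, 52),
    "Psychologists": (158, 77),
}

rates = {group: returned / sent for group, (sent, returned) in mailings.items()}
for group, rate in rates.items():
    print(f"{group}: {rate:.1%}")
```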

For the conservative Catholic-Lutheran group of 42 per- 
sons the score on the Belief Pattern Scale was 24.44, for the 
more liberal Congregational-Unitarian group of 52 persons 
the score was 12.30. The difference between these scores is 
12.14 with an S.D. of the difference of 2.2. The 77 psy-
chologists made a mean score of −6.8 showing considerable
religious hostility. They exceeded the liberal ministers in 
radicalism by 19.1 points. The S.D. of the difference of the 
two means amounted to only 2.45. It is obvious that the test 
discriminates sharply between groups known to differ widely 
in their pattern of religious attitudes. A pattern analysis of 
the returns by proposition categories showed that epistemolog-
ical considerations were more significant in the scores of psy-
chologists than in the case of the ministers. In the case of
the ministers the proportionate weight of each category was 
more equally distributed. 


RESULTS AND IMPLICATIONS OF ATTITUDE RESEARCH 


The justification for a methodological tool depends ulti- 
mately upon its value in dealing with significant scientific 
problems. A sociological problem is significant in proportion 
as there is a definite question answered which has a range of 
implications such that the answer properly presented neces- 
sarily affects sociological thought systems present and future. 


13 Ed. by Carl Murchison, Clark Univ. Press, Worcester, Mass., 1928.
To have significance a research would have to deal with phe-
nomena which have some degree of universality. There is no
sociological phenomenon more widespread and more basic to
all cultures than the process by which one takes over the atti- 
tudes, habits and customs of his fellows through interaction in 
primary groups. Universally men reflect the majority opinion 
of the groups with which they are identified. Going hand in 
hand with the universal tendency to conformity is the uni- 
versal tendency for change to take place from generation to 
generation. 

By way of investigating the balance between attitude con- 
formity and attitude change from generation to generation 
a study was made of the relation between the religious atti- 
tudes of college students and those of their own parents. Such 
a comparison keeps many factors of economic, social and edu- 
cational status constant since the comparison is between mem- 
bers of the same family group. The Belief Pattern Scale was 
given to 213 students at the University of Minnesota, mostly 
sophomores. In the case of 172 students blanks were also 
returned from one or both parents. 

For 94 mothers the mean score on the Belief Pattern Scale 
was 13.6 and for the 78 fathers the mean score was 11.5. The 
S.D. of this difference of 2.1 is 2.3 so the difference between 
the two parental groups is not significant. For 100 female 
students the mean score was 5.2 while for the 111 male stu-
dents the mean score was −1.3.¹⁴ The sex difference in re-
ligiosity is 6.5 with an S.D. of the difference amounting to 1.4.
There is perhaps a real tendency here for stronger religious 
attitudes on the part of the females. 

The mean score for 172 parents disregarding sex was 12.3 
while for a younger generation represented by their children 
the mean score was 1.8. The difference is 10.5 with an S.D. of the dif-
ference of 1.5. Whether this bespeaks an age difference or a
social trend is not absolutely certain.¹⁵ Certainly the data

14 For two students the sex could not be determined from the blanks. 

15 A correlation between chronological age of parent and test score
would throw light on this problem. The results of such a correlation are
not available at present writing.
reveal no suggestion of a modern revival of religious
orthodoxy. 

Do students reflect their family backgrounds in regard to 
religious attitudes? For 46 pairs the rho between mother’s
score and daughter’s score was +.53 ± .074. For 44 pairs the
rho between mother’s score and son’s score was +.62 ± .065.
For 37 pairs the rho between father’s score and daughter’s
score was +.53 ± .084. For 38 pairs the rho between father’s
score and son’s score was +.33 ± .102. Disregarding sex of
offspring the rho between mother’s score and children’s score
is +.57 ± .049 as compared with +.35 ± .071 between father’s
score and children’s score. There is a suggestion in the data 
that the influence of the mother is stronger at least on the sons 
than that of the father. 

There is a crumb of evidence for either assortative mating 
or attitudinal convergence of personality in a rho between the 
scores of the two parents which amounts to +.56 ± .060.
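
The second figure attached to each rho appears to be a probable error. The text does not say how it was computed; the sketch below assumes Pearson's formula for the probable error of a rank correlation, 0.7063(1 − rho²)/√N, and compares its results with the figures reported for those coefficients whose number of pairs is stated.

```python
from math import sqrt

def probable_error_rho(rho, n):
    """Assumed formula: probable error of a rank-difference correlation."""
    return 0.7063 * (1 - rho ** 2) / sqrt(n)

# (label, rho, number of pairs, probable error as reported in the text)
reported = [
    ("retest", 0.94, 90, 0.009),
    ("mother-daughter", 0.53, 46, 0.074),
    ("mother-son", 0.62, 44, 0.065),
    ("father-daughter", 0.53, 37, 0.084),
    ("father-son", 0.33, 38, 0.102),
]

for label, rho, n, stated in reported:
    print(f"{label:16s} computed {probable_error_rho(rho, n):.3f}  reported {stated:.3f}")
```

On this assumption the computed values agree with the reported ones to within about one unit in the third decimal place.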





THE VALIDITY OF EXAMINATIONS¹


W. L. VALENTINE AND J. E. WENRICK²
The Ohio State University


RECENT trends in higher education involve a recognition
of the inadequacy of detailed factual examinations
alone in expressing a mastery of the skills and attitudes
which are assumed to be the natural outcome of a course of
study. While properly constructed fact examinations have 
a validity for a mastery of knowledge, they do not measure 
what is generally recognized to be more important—a skill 
in the application of those facts to socially or individually 
significant problems. 

As long as instructors were satisfied with factual examina- 
tions, the question of validity was not recognized. It was 
customary to make some adjustment in the marks of students 
on the basis of the instructor’s general impression of a per- 
formance which did not show up in the written examinations. 
The objective records of mastery of fact were supplemented 
and reinterpreted in the light of the instructor’s subjective 
estimate of the student’s ability to apply memorized facts. 
As soon as instructors attack the problem of constructing tests 
for an objective measure to be substituted for the subjective 
estimate of non-subject-matter mastery, the problem of 
validity becomes significant and possible of solution. 

Detailed factual examinations as ordinarily constructed pre- 
suppose a preparation in a course. The student must read 
textbooks, generally a specific book, and religiously attend 
lectures for fear he will miss some insignificant detail that 
later will be amplified into an important factor in some 

1 This is a portion of a more complete study involving other attacks
on the problem of validity. Since the publication of the complete study
will be indefinitely delayed, this paper is offered because it has features
of general interest.

2 The authors take this opportunity to acknowledge the helpful criticism 
of S. L. Pressey in the preparation of this paper.
examination.³ But when other objectives are taken into con-
sideration, it becomes possible for a student to satisfy them
frequently without preparation in a specific textbook or atten-
dance at some special course of lectures.

Let us suppose, for example, that one of the course objectives 
is ‘‘to develop a skill in the practical application of psycho- 
logical principles.’’ It is conceivable that an intelligent per-
son could possess a functional mastery of genuine principles
without ever having had a formal course in psychology. It is 
even conceivable that he could apply principles functionally 
that he could not recognize when they were stated formally. 
It is probably more frequently true than we would care to 
admit that a student can be even more skillful than his 
instructor in this respect. For a large number of instructors, 
psychology is either a laboratory discipline or an intellectual 
exercise which has no practical application. 

In attempting to measure the mastery of some of these addi- 
tional objectives a quiz was designed and administered to 
all of the students in the beginning course. It comprised
four subtests aimed at testing (a) Recognition of Prin-
ciples, (b) Practical Application of Principles, (c) Interpreta-
tion of Experiments, and (d) Ability to Distinguish Observa-
tion from Inference. There were (a) 20 items; (b) 36; (c)
53; (d) 14; total 123 items.

The quiz, designated C-8 in our files, was so constructed
that memory of the specific material of the course was rela-
tively unimportant, but an intelligent understanding of the
viewpoints developed and the interpretations made was, we
thought, necessary. After the quiz was given, discussed with
the students and by the staff, there were some doubts raised 
as to whether it was not just a test of general intelligence 
rather than of psychological insight and whether it would 
not be possible for someone who had never had a course in 
psychology to do as well with it as one who had attended the 

3 In one recent examination of the essay type containing five questions,
one of them referred to what appeared to the outsider, at least, to be
an obscure and insignificant detail occurring in a footnote.
course. The skewed nature of the distribution of letter grades
was the principal reason for this view. 

As a consequence of this discussion the test was adminis- 
tered to 49 freshmen in the College of Education who were 
members of a class of 385 students in a Freshman Survey 
Course, but who had had no previous formal instruction in 
psychology. 

The scores of the 49 were assigned letter grades on the basis 
of the norms established by the 750 actually registered in 
psychology. The distribution of the grades of the 49 was as 
follows: A, none; B, 1; C, 28; D, 12; and E, 8. The distribu- 
tion of letter grades for the 750 registered was A, 126; B, 
168; C, 375; D, 51; E, 30. Thus there is an obvious difference 
between the scores, but the number not taking psychology 
is so small that numerous factors, other than course experi- 
ence, may have influenced the results. 
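
Because the two groups differ so much in size, the grade distributions are more readily compared as percentages. A short sketch using the counts given above (the variable names are introduced here for illustration):

```python
grades = ["A", "B", "C", "D", "E"]
no_course = dict(zip(grades, [0, 1, 28, 12, 8]))         # the 49 freshmen
registered = dict(zip(grades, [126, 168, 375, 51, 30]))  # the 750 registered

def percent_distribution(counts):
    """Convert grade counts to percentages of the group total."""
    total = sum(counts.values())
    return {grade: 100 * n / total for grade, n in counts.items()}

pct_no_course = percent_distribution(no_course)
pct_registered = percent_distribution(registered)
for g in grades:
    print(f"{g}: no course {pct_no_course[g]:5.1f}%   registered {pct_registered[g]:5.1f}%")
```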

In order to make direct comparisons between those who had 
had and those who had not had the course, a group of Educa- 
tion Freshmen, who were registered in the course, was selected 
and paired with the experimental group,⁴ on the basis of
centile rank, score on the entrance blank⁵ and a special
reading score.⁶

In pairing the students for the control group it was noted 
that it would be possible to obtain two control groups, one 
a littie more precisely paired on the basis of the three scores, 

4 Educational technicians are in the habit of thinking of those students
who have had the training as the ‘‘experimental’’ group. In this study,
however, the groups are correctly designated in the reverse of the usual
procedure.

5 The entrance blank score was provided us by Maurice Troyer who has 
developed a method of scoring the uniform entrance blank filled out by 
all freshmen before entering the University. Essentially it is a measure 
of cultural level and principal’s estimate of probable academic success. 
A complete description of the method of arriving at the score will be 
found in a Ph.D. dissertation, the Ohio State University Library. 

6 The reading score is from a special test devised by S. L. Pressey. 
It involves reading and comprehension, the use of a dictionary, foreign 
words and phrases, abbreviations, graph and plan reading, and English 
grammar and vocabulary. 
but both sufficiently similar to the experimental group to
serve as controls. These two groups are called hereafter
Control I and Control II. The pairings were made by one
of us without knowledge of the performance on the quiz under 
examination. The averages are shown in Table I together with 
the average difference and the maximum difference between 
the respective controls and experimental groups. 


TABLE I

                        CENTILE           O.S.U. ENTRANCE BLANK   READING SCORE

Experimental   N = 37*  70.8              20.3                    53.6
Control I      N = 37   70.9 ± 3.7 - 13   20.3 ± 2.5 - 6          53.0 ± 3.6 - 13
Control II     N = 37   72.3 ± 5.6 - 23   16.4 ± 4.7 - 12         57.8 ± 9.0 - 20

* Incomplete records reduced the original number in the experimental 
group from 49 to 37. No bias in test performance was introduced by 
dropping the 12 incomplete cases since the median both before and after 
was 77. 


In addition to the data shown in Table I all groups were 
alike in age, sex, college, and length of residence in the 
University. 

We would conclude, therefore, that the groups are satisfac- 
torily equated in all significant respects so that the differences 
found in the scores are beyond a doubt due to the experience 
in psychology in the one case and to its lack in the other. 

The per cent distribution of total scores for these groups 
is shown in Table II. Comparison between the experimental 
group and the total registration exhibits clear-cut superiority 
for the latter group; but this superiority is due to factors 
other than psychological experience as shown by the relative 
decline of the difference when comparisons are made between 
the experimental groups and the two controls. Even here, 
however, the difference is probably significant. 

The average scores made on each of the parts of the quiz 
are shown in Table III. 

The critical ratios (CR) were computed on each of the 
differences between the average of the control and experimental
groups except one (Practical Applications) where

TABLE II
Per cent Distribution of the Various Letter Grades

A C

Experimental Group ...... 57
Control I 2 11
Control II ......... 50

16 50

inspection showed it would not be significant. From these 
data we conclude that in the ‘‘recognition of psychological 
principles’’ and in the ‘‘interpretation of experiments’’ as 
measured by these tests there is a statistically reliable differ- 
ence. There is not, however, the magnitude of difference 
that one might hope to find. It is of the order of 20 to 25 
per cent. The tests are short and probably unreliable, but in 
this case the unreliability makes the significance of the dif- 
ference more profound. 
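
A critical ratio is conventionally the difference between two means divided by the standard error of that difference. The paper does not print its formula, so the sketch below assumes the usual form for two independent groups; paired groups might call for a different standard error. The figures in the example are hypothetical and are not taken from Table III.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference of means over its standard error (independent groups assumed)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error


if __name__ == "__main__":
    # Hypothetical figures, chosen only to make the arithmetic easy to follow.
    cr = critical_ratio(mean_a=12.0, sd_a=3.0, n_a=25, mean_b=15.0, sd_b=4.0, n_b=25)
    print(round(cr, 2))
```

By the convention common at the time, a ratio of about 3 or more was read as a reliable difference.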

In the ‘‘practical application of principles’’ and in ‘‘dis- 
criminating observation from inference’’ these tests show no 
reliable difference between the groups. 

There is still a possibility that differences do exist in these 
abilities, but are obscured by faulty tests. It will be recalled 
that there was dissatisfaction with the tests in the first place. 


TABLE III

                                 EXPERIMENTAL    CONTROL I           CONTROL II          I-II*
                                 Mean    σ       Mean    σ     CR    Mean    σ     CR    CR

Recognition of Principles        [illegible]     15.7    2.0   6.0   16.0    2.0   6.0   0.0
Practical Application
  of Principles                  [illegible]
Interpretation of Experiments    [illegible except the final entries 6.0, 5.0, 0.7]
Observation vs. Inference        [illegible]     10.8    3.4   2.6   11.0    4.5   2.2   0.0

*The critical ratios in this column between the two control groups 
strengthen our conclusions. 
We would not have been surprised had there been no reliable
differences in any of the tests.7

The ideal arrangement in validity studies would be to have 
equated groups, one in and one outside the course, take the 
routine succession of quizzes and examinations from the begin- 
ning to the end of the quarter. Unless one had funds to pay 
the students who were not in the course, it is not likely that 
he could enlist the hearty cooperation of a significant number.8
Since, however, we had already made contact with the stu- 
dents, we thought it desirable to continue the study of validity 
with whatever number we could enlist. We therefore selected 
the highest 12 of our 37 and offered them the opportunity 
of obtaining credit for the course if they received high grades 
on a series of examinations that we would prepare. They 
were told why they were selected and the possibility of avoid- 
ing an introductory course in favor of a more advanced one 
appealed to 10 of the 12. They were given textbooks and conferences
and a week in which to study for the examination.9
Eight of the 10 ultimately finished all of the examinations
and one of the eight was certified for credit.10 She subsequently
made a perfect record for her first year and was most
active in extra-curricular functions. 

7 We are not concerned here with the validity of the quiz as a whole,
only with that of its parts. For the sake of completeness and to prevent
misunderstanding let it be observed that the median score for the
experimental group is 77, the lowest 49, the highest 91, while for both of the
controls the median is 89, the highest scores 101 and 98, the lowest 65
and 61.

8 The motivation of the experimental group is of extreme importance 
in experiments like these. It is entirely probable that we could have found 
reliable differences in all tests if we had made it clear that we really did 
not expect any of the experimental group to be successful. But on the 
contrary we held out every inducement to encourage the best possible 
performance. 

9 These aids, while from one standpoint unfortunately necessary
to lend verisimilitude to our offer of credit, are from another happy
variations from the usual technique, because they present the results in
the worst possible light instead of in the best.

10 The process of successive elimination left a distinctly superior group. 
The average centile rank of the eight was 90. 
We maintained the same pairings as in the previous section,
but due to the small number of cases individual comparisons 
are probably more instructive than groupings. As before, the 
scores were reduced to letter grades on each of the subtests 
which comprised in addition to the four topics mentioned 
previously, tests of vocabulary, technical and general; tests 
of graph reading and interpretation and factual material of a 
general sort from the textbook (most of these items are para- 
phrased from the language in the text, excepting technical 
terms). 

The individual records of performance are shown in the 
following eight figures where the cumulative letter grades are 
plotted as points (A, 3; B, 2; C, 1; D, 0; and E, -1). These
charts are duplicates of a progress chart which is kept by 
the students in the course as a routine informational and 
motivating device. They are somewhat shortened here because 
the experimental group did not take all of the quizzes that 
are ordinarily a part of the course. The weights for the 
various quizzes are proportionately the same as used in deter- 
mining the final grade in the course for the students who were 
registered. The letter grades shown on the charts and the 
straight lines indicating the extent of letter-grade spread are 
arbitrarily located and more recent information than we had 
available at the time the charts were drawn indicates that the 
A and B lines should be lowered somewhat and the line sepa- 
rating the D’s from the E’s raised. This is a more radical 
correction for regression toward the mean than is shown in 
the charts here presented, but the three lines on each chart 
are directly comparable regardless of the arbitrary limits 
selected from each letter-grade range in the determination of 
the final mark in the course. 

The final examination score is placed first and weighted 
6 quiz units and the quiz (C-8), from which the detailed results 
have already been given, is weighted 5 units and placed last. 
This arrangement gave the widest separation throughout the 
total length of the curves, so long as we maintained the 
approximate weights that were used for determining the grade 
for the control groups. Since the final examination turned
out to be the most differentiating instrument, the results could 
have been magnified to any proportion by more significantly 
weighting it in comparison with the other quizzes. 
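
The progress charts described above can be sketched as a running total of grade points. The point values are those given in the text. Treating each quiz as contributing its grade points multiplied by its weight in quiz units is an assumption, and the sequence of grades in the example is hypothetical.

```python
GRADE_POINTS = {"A": 3, "B": 2, "C": 1, "D": 0, "E": -1}  # values given in the text


def cumulative_points(results):
    """results: list of (letter_grade, quiz_units) in plotting order.

    Returns (cumulative_units, cumulative_points) pairs, one per quiz, on the
    assumption that a quiz adds its grade points times its weight in quiz units.
    """
    curve = []
    units_so_far = 0
    points_so_far = 0
    for grade, units in results:
        units_so_far += units
        points_so_far += GRADE_POINTS[grade] * units
        curve.append((units_so_far, points_so_far))
    return curve


if __name__ == "__main__":
    # Hypothetical record: final examination first (6 units), quiz C-8 last (5 units).
    record = [("B", 6), ("C", 1), ("D", 1), ("C", 5)]
    print(cumulative_points(record))
```

Plotting the second element of each pair against the first gives one curve of the kind shown in the figures; a heavier weight on any one quiz steepens the curve over that stretch.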

The general trend is for the two controls to be relatively 
close together and the experimental case distinctly below. 
There is one case (Fig. 3) where there is no difference. This 
is the case certified for credit. Two others (Fig. 6 and Fig. 8) 
approach this condition. The case shown in Figure 6 probably 
should have been allowed credit (with a grade of C) and 
undoubtedly in an advanced class would have performed 
satisfactorily. 

Aside from the general level as indicated on the chart, the 
slope of the curves indicates the diagnostic value of the several 
instruments. The final examination is the most valid as shown 
by the diverse nature of the slopes for those who have and 
those who had not had the course. The quizzes composed of 
items regarding technical vocabulary and general factual 
material from the text-book and graph reading are not, gener- 
ally speaking, valid in the sense of discriminating between 
those intelligent students who have had the course and those 
who have not. They may be valid for different levels of 
general ability. 

The remaining outstanding characteristic of the curves is 
the broken appearance of those of the experimental group. 
The grades are not so consistent from quiz to quiz as are 
those of the control groups. 

For completeness, the median cumulative points have been 
plotted against quiz units in Figure 9. The group of 16 con- 
trols is plotted as a single curve because two curves of 8 each 
would be practically superimposed. The measure of central 
tendency was chosen because of the atypical distribution of 
scores of the experimental group. The plotted points, average 
or median, make little difference in the appearance of the 
control curve, but when the average is used for the experi- 
mental group, the curves are brought distinctly closer together 
although there still appears to be a reliable difference between 
them. The numbers are too small, of course, for statistical
techniques to have any significance.

Further light on the nature of the quiz units which are
most diagnostic is shown by an analysis of the scores on the 
final examination. It will be recalled that ‘‘recognition of 
principles’’ and ‘‘interpretation of experiments’’ were the 
diagnostic parts and that tests in the ‘‘practical application 
of principles’’ and ‘‘discriminating observation from infer- 
ence’’ did not require any course experience. These results 
were based on 37 experimental cases and twice that many 
controls. If we had only the 8 cases upon which the more 
intensive study was made, our conclusions would have been 
the same except that we would not have had the statistical 
verification of the reliability of the difference. In view of this 
fact and in line with the larger number of cases, we infer 
that it is possible to extend to the final examination the tech-
nique already used with the quiz C-8. 

The final examination comprised six subtests: (a) recognition
of principles; (b) paraphrased material of the textbook;
(c) definitions; (d) interpretation of experiments; (e) scientific
method; (f) practical applications of principles. The
average scores on these parts are as follows: 





                           a      b      c      d      e      f

Experimental ..........   20.8   50.0   35.4   33.3   14.8   28.5
Control ...............   22.6   53.0   41.5   40.1   18.0   29.0
Difference ............    1.8    3.0    6.1    6.8    3.2    0.5
Per cent of Control ...    8.0    5.6   14.5   17.0   17.7    1.7





The differences between the averages, and the per cent that
each difference is of the average of the control group, are likewise
shown. These per cents have no significance except to locate
the tests that contribute to the differentiation between those 
who have had and those who have not had the course. These 
tests are ‘‘definitions,’’ ‘‘interpretation of experiments,’’
‘‘scientific method’’ and possibly ‘‘recognition of principles.’’
‘‘The paraphrased material from the textbook’’ and ‘‘prac- 
tical application of psychological principles’’ are not differ- 
entiating. Both of these tests contain much that is reason- 
able, general and common sense. On the basis of our previous 
findings with the whole group of 37 we should expect these 
results. 
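
The arithmetic behind the last two rows of the table can be checked directly. The sketch below takes the lower row of averages as the experimental group and the higher row as the controls, which is an assumption where the row labels are unclear in this copy; the names used are illustrative.

```python
SUBTESTS = ["a", "b", "c", "d", "e", "f"]
EXPERIMENTAL = [20.8, 50.0, 35.4, 33.3, 14.8, 28.5]  # assumed: those without the course
CONTROL = [22.6, 53.0, 41.5, 40.1, 18.0, 29.0]       # assumed: those with the course


def compare(experimental, control):
    """Return (difference, per cent of control) for each subtest, to one decimal."""
    rows = []
    for exp_mean, ctl_mean in zip(experimental, control):
        difference = ctl_mean - exp_mean
        rows.append((round(difference, 1), round(100.0 * difference / ctl_mean, 1)))
    return rows


if __name__ == "__main__":
    for name, (diff, pct) in zip(SUBTESTS, compare(EXPERIMENTAL, CONTROL)):
        print(name, diff, pct)
```

The recomputed differences match the printed row, and the recomputed per cents agree with it to within about two-tenths of a point. Subtests c, d and e show the largest relative gaps.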

It will be recalled that this group had a week in which to 
study. The relative equality of the scores for ‘‘recognition 
of principles’’ and ‘‘paraphrased material from the text’’ is 
probably explainable on this basis, coupled with the fact that 
this group as a whole stood in the top quintile in the psycho- 
logical examination. 

In summary: We have shown that a sample of the quizzes, 
considered as a whole, differentiates satisfactorily between
those who have and those who have not had the beginning 
course. Subtests of recognition of basic principles upon which 
the course is founded, definitions of technical terms, scientific
method as applied to psychological problems, and interpreta- 
tion of experiments have a higher validity in this respect than 
do general facts and principles paraphrased from the text, 
graph reading and interpretation, or the practical application 
of psychological principles. These latter two tests apparently 
have a reasonableness in the correct answers that does not, for 
an intelligent person, require a course in psychology for their 
discrimination from the various incorrect answers. 

These conclusions are based on the quizzes as they now 
stand. It is probably possible though difficult to construct 
quizzes falling in the latter two categories which will have 
validity. 

The question of validity is an important one in connection 
with the move to include objectives in addition to mere mas- 
tery of detailed fact in the teaching and testing techniques 
of the beginning course. Without its consideration there is 
the danger of including tests that measure performance con- 
ditioned by other factors than those included in the course. 








A FACTOR ANALYSIS OF THE PERSONALITY 
OF HIGH SCHOOL LEADERS 


EDWIN G. FLEMMING 
New York, N. Y. 


THE published studies of leaders and leadership fall into
two classes. The discursive articles attempt a logical
analysis of types of leaders and the situations in which
leaders function. The second group consists of statistical
analyses of specific traits of physique, personality or character 
in relation to leadership as demonstrated by executive positions 
held. Both classes of articles discuss the difficulty of relating 
leadership to traits because of the obscurity of definitions. 

Although it is true that leadership may be of several kinds 
such as ability to plan with or without ability to execute, 
ability to direct others to a given goal with or without ability 
or imagination to envisage that goal in the first instance, and 
the ability to counsel others on projects which the individual 
himself does not seem to have the power to initiate; neverthe- 
less it is quite possible that there are certain traits common 
to all types of leaders. It is also possible that certain clusters 
of traits are more likely to be associated with leadership than 
some other clusters. 

Likewise it is true that it is difficult to define terms. Traits 
of personality seem to be quite elusive and many terms are 
used to mean several things in different contexts, such as the 
word personality itself. Many traits, too, are possessed or 
not possessed only relatively; or there may be different kinds
of qualities indicated by the same linguistic symbol as in the
case of modesty, which may represent reticence and shyness
with respect to sex, the absence of vanity as related to personal 
appearance, or lack of pride in native intellectual equipment 
or attainment. 

Yet we do judge people and characterize them in one way 
or another. Our judgments of a particular moment are 
generally understood without resorting to the necessity of
first pedantically defining our terms. It should be quite pos- 
sible to investigate personality traits without laboring too 
much over definitions, lest progressive inquiry be ‘‘sicklied 
o’er with the pale cast of thought.’’ Binet, it is said, did 
not stop to define ‘‘intelligence’’ before he tried to measure 
it and to find out what it was. Indeed, it is partly through 
inquiry without definition, as in the case of the X ray, that 
we finally arrive at that state of knowledge when we can 
define with some degree of stability in the definition and some 
assurance of general acceptance of the description of the term. 

This study, then, has made no attempt to define terms, but 
has proceeded on the assumption that intelligent people use 
words with general accord and understanding in their basic 
meaning. If any statistical data are gathered on this assump- 
tion which show any definite tendencies of association the 
assumption is justified, since any great and various discrep- 
ancy in the understanding of the terms used would yield 
coefficients close to zero, the coefficient of chance. 

We are attempting to determine what of a large number 
of psychological traits, presumably associated with per- 
sonality, are related to ability to lead, and to see whether 
such ability is more definitely associated with certain clusters 
of traits (or ‘‘personality types’’ if you will) than with cer- 
tain other clusters. 

The criterion of leadership is based upon the positions of 
leadership or responsibility actually held by the subjects 
during the ninth, tenth and eleventh grades in the Horace 
Mann School for Girls. The senior year was not included 
because only about half of the subjects had completed the last 
year. Various positions received credit points according to
the following schedule. 


10 points—President of the General Association, President of the Girls’
League, Editor-in-chief of the Horace Mann Record. 

7 points—Editor-in-chief of the Mannikin, Treasurer of the Girls’ 
League, Vice-President of the General Association. 

6 points—Chairman of Manuscript, Vice-President of the Girls’ League,
Business Manager of the Mannikin. 

5 points—Assistant Editors of the Horace Mann Record, Secretaries of 
the Girls’ League and the General Association, Presidents of Classes,
Chapter Leaders. 

4 points—Manager of Basketball, Feature Writers of the Horace Mann 
Record, Exchange Editor of the Horace Mann Record, Business 
Manager of the Horace Mann Record, Editors of Mannikin, Assistant 
Business Manager of Mannikin, Club Presidents. 

3 points—Manager of Hockey, Assistant Treasurer of the General As- 
sociation, Reporters of the Horace Mann Record, Managers of 
Swimming and Tennis. 

2 points—Vice-Presidents, Secretaries and Treasurers of all classes, Vice- 
Chapter Leaders, School Cheer Leader, Editors of Manuscript. 

1 point—Class Managers of Basketball, Hockey and Swimming, Class 
Cheer Leaders, Chairman of any Committee. 

½ point—Member of any Committee, Delegate to intra-school Convention.
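
The schedule above amounts to a lookup table of credit points. The sketch below lists only a few of the positions, sums the points for the positions a pupil held (summation is an assumption, since the paper does not say how multiple positions were combined), and reads the last entry of the schedule as one-half point, which is also an assumption where the scan is unclear. The pupil in the example is hypothetical.

```python
# Credit points per position, following the schedule above. Only a few
# positions are listed; the one-half point entry is an assumed reading.
POINTS = {
    "President of the General Association": 10,
    "Editor-in-chief of the Mannikin": 7,
    "Chairman of Manuscript": 6,
    "President of Class": 5,
    "Club President": 4,
    "Manager of Hockey": 3,
    "School Cheer Leader": 2,
    "Chairman of any Committee": 1,
    "Member of any Committee": 0.5,
}


def leadership_score(positions_held):
    """Sum the credit points over every position held (assumed combination rule)."""
    return sum(POINTS[position] for position in positions_held)


if __name__ == "__main__":
    # Hypothetical pupil, for illustration only.
    held = ["President of Class", "Club President", "Member of any Committee"]
    print(leadership_score(held))
```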


The subjects were seventy-one girls of the Horace Mann 
High School for Girls, Teachers College, Columbia University, 
comprising the junior and senior classes.1

The teachers of these girls were given lists of forty-six 
traits and asked to check for each girl each item that could 
be attributed to the particular girl. The traits thus checked 
appear in Table 1. At least three teachers checked a list 
for each girl, while in some cases six teachers gave ratings 
for a particular girl. The average number of lists checked 
per girl was 3.6. 

In addition each girl indicated on a scale of ten the inten- 
sity of pleasant feeling that she subjectively associated with 
every other girl of her class. This gave a measure of what 
may be termed the pleasingness of the personality of each 
girl. There was an average of over thirty-five ratings for 
each girl on this factor of pleasingness. 

The teachers also on a scale of ten indicated the amount of 
personality that each girl possessed, when the term is used in 
such an expression as, ‘‘She has a great deal of personality.’’ 

The method of determining the degree of association be- 
tween leadership and each of the traits on the list was by 
means of the Thurstone diagrams for securing the tetrachoric 

1 It is necessary that I acknowledge my indebtedness to Dr. Cecile
White Flemming, director of the division of pupil adjustment of the
Horace Mann School, who gathered the data which made this study possible,
and to express appreciation for the cooperation of the teachers and
girls in the school who participated in this inquiry.
coefficients of correlation. A study of the groupings of the
traits was made with the aid of Thurstone’s simplified method
of factor analysis in a manner described later.2
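
The Thurstone computing diagrams gave the tetrachoric coefficient graphically from a fourfold table. As a rough stand-in, and not the method used in the study, the sketch below applies the cosine-pi approximation to a 2 x 2 table of leaders and non-leaders against trait checked or not checked. The cell counts used in any example are hypothetical.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    Fourfold table layout (counts):
        a = leader, trait checked        b = leader, trait not checked
        c = non-leader, trait checked    d = non-leader, trait not checked
    """
    concordant = a * d
    discordant = b * c
    if concordant == 0 and discordant == 0:
        raise ValueError("table is degenerate: both diagonals are empty")
    if discordant == 0:
        return 1.0   # perfect positive association
    if concordant == 0:
        return -1.0  # perfect negative association
    return math.cos(math.pi / (1.0 + math.sqrt(concordant / discordant)))


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(round(tetrachoric_cos_pi(40, 10, 10, 40), 3))
```

The approximation is generally closest when both variables are split near the middle of their distributions, so it should be read as a guide to the size of the coefficient rather than as a reproduction of the published values.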

The usual correlation technique was used to determine the 
relation of leadership to personality as measured by the 
teachers’ ratings, to pleasingness of personality as indicated 
by the ratings of the girls of one another, to the average 
number of traits attributed to the girls from the entire list, 
and to the average number of traits attributed or not attrib- 
uted from a selected list of eight items which appeared to 
be most significantly related to leadership as determined on 
the basis of groupings of traits after factor analysis. 


RESULTS 


The coefficient of correlation between leadership and per- 
sonality as rated by the teachers is .50. Between leadership 
and pleasingness of personality as rated by the girls the 
coefficient of correlation is .33. 

These two correlations would indicate that there is a positive
and definite relation between leadership and personality:
the more personality the individual has, the more likely he
is to be called to positions of leadership, either in terms of
number or importance of position. They indicate also that leaders are
likely to be pleasing to their contemporaries. But personality
seems to be of more importance than pleasingness, which seems 
to be a healthy sign suggesting that girls of the type of those 
at the Horace Mann School do not overweight their own 
pleasant feelings when they choose or submit to their leaders. 
It is quite possible, and I should think highly probable that 
in a more random selection of the population representing a 
greater range of intelligence, experience and cultural back- 
ground these correlations would be somewhat higher. But 
there is no special reason for believing that personality would 
not count for more than the pleasingness of the individual 
leader’s personality. 

2 The factor analysis was made and the groupings of traits determined
with data on eighty-four girls, including the seventy-one girls for whom
alone there were data on leadership.
Whether these results can be applied to women in general
as well as to girls is an open question. However, the study 
of Levi shows that the transfer of leadership from junior 
to senior high school is great, so we may infer that there is 
consistency of leadership ability from adolescence into adult 
life. 

To conclude that the same relationships hold for boys and 
for men as for the girls studied is anybody’s guess. The 
probabilities are that they are likewise more or less valid for 
boys and men, since numerous studies of psychological traits 
show no or few extremely significant differences between the 
sexes on any one trait, and the differences within a sex group 
are greater than differences between sex groups. Even should 
there be more difference the correlations are high enough so 
that the tendencies shown probably would still hold for boys 
and men. 

The tetrachoric coefficients of correlation between leader- 
ship and each of the traits in the check-list presented to the 
teachers are indicated in Table 1. Perhaps the outstanding 
feature of these coefficients is that no single one of them shows 
a very high significant association with leadership. None of 
them is as high as the correlation with personality. Leader- 
ship, like personality itself, is apparently made up of a 
number of diverse elements, no one of which is of paramount 
importance in relation to the others. 

But there are four traits positively and significantly asso- 
ciated with leadership with correlations between .40 and .47. 
Liveliness, wide interests, intelligence, and being a ‘‘good 
sport’’ are more characteristic of leaders than of those who 
are not leaders. Other qualities of probable significance with 
correlations between .30 and .38 are originality, athletic 
ability, cleverness, a sense of humor, being cultured, indi- 
viduality, sociableness, power to amuse, being well informed, 
and competence. 

Worth noting are the six negative coefficients at the bottom 
of the list—smiling countenance, tolerant, courteous, good 
natured, not easily excited, and modest. While the coefficients 
are not large enough to show any negative association of sig-
nificance for this particular group, with the possible exception 
of modesty, it might be worth investigating them further with 
a more random sample. This would be especially worth while 
with modesty, since further analysis in this study shows its 
absence to be of some importance, and since there seem to 
be some logical and obvious reasons why leaders should be 
lacking in modesty. 

The correlation between leadership and the average number 
of traits checked for each girl is .39. It is lower than the 
four highest tetrachoric coefficients due to the fact, no doubt, 
that a number of the traits are negatively associated with 
leadership. 

Our next problem then is to determine whether there are 
any combinations of traits which would yield a highly signifi- 
cant correlation with the criterion. For this purpose the best 


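TABLE 1
Showing the Tetrachoric Coefficients of Correlation Between Leadership
and the Traits Indicated

Is lively ..........................................
Has wide interests ................................. .45
Is intelligent .....................................
Is a ‘‘good sport’’ ................................ .40

Is original ........................................
Is athletic ........................................
Is clever ..........................................
Has a sense of humor ............................... .34
Is cultured ........................................ .34
Has individuality .................................. .33
Is sociable ........................................
Is amusing .........................................
Is well informed ...................................
Is competent .......................................

Has a pleasant voice ............................... .28
Has good judgment .................................. .28
[illegible] ........................................
Is entertaining .................................... .28
Is interesting in conversation .....................
Is tactful .........................................
Is talented in some field of art ................... .22
Is attractive in personal appearance ............... .21
Is adaptable ....................................... .21
Is natural, unaffected ............................. .20
Is honest .......................................... .19
Is frank ...........................................
Is helpful .........................................
Is idealistic ......................................
Is generous ........................................
Is industrious .....................................
Is fair ............................................
Is unselfish .......................................
Is considerate of others ...........................
Is understanding ...................................
Is loyal ...........................................
Is dependable ......................................
Is sincere .........................................
Is sympathetic .....................................
[illegible] ........................................
Is neat ............................................

Has a smiling countenance ..........................
Is tolerant ........................................
Is courteous .......................................
Is good natured ....................................
Is not easily excited ..............................
Is modest ..........................................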

procedure seemed to be to make an analysis of the common 
factors underlying all the traits. Accordingly the intercor- 
relations among all the characteristics were analysed by means 
of Thurstone’s simplified method and four general factors
were found to account for the bulk of correlations in the
original correlation matrix. The fourth factor residuals had 
a sigma of .04 which is smaller than the sigma of .069 which 
Thurstone found for his fifth factor residuals. 

While at the present stage of our knowledge of the general 
factor technique few would attempt to label general factors 
with a specific word, it is of interest to observe that the first 
general factor centered about the trait of fairness, the second 
had for its core originality, the third liveliness, and the fourth 
had for its major trait pleasant voice. 

All the factor loadings with the first factor were positive, 
but with the second, third and fourth factors some of the 
loadings were positive and some negative. To demonstrate 
the clustering of traits when there are four factors is impos- 
sible graphically, since four factors would take us into a fifth 
dimension. To circumvent this difficulty it was decided to 
group the traits on the basis of the sign of their factor load- 
ings on the theory, for example, that traits which were all 
positively loaded with all factors formed a distinct cluster
of qualities more or less interrelated. Likewise other traits 
would constitute clusters when their factor loadings were the 
same. 

By thus grouping the characteristics on the basis of factor 
loadings for the four factors we secured eight clusters or cate- 
gories. The procedure was as follows: Since all factor load- 
ings with the first factor were positive we had one group con- 
sisting of all the traits studied. This group was divided into 
two on the basis of the positive and negative loadings with 
the second factor. Each of these two groups was again 
divided into two parts dependent upon the plus or minus 
sign of the factor loading with the third factor. Finally the
division of each of these four groups as determined by the
signs of the factor loadings with the fourth factor gave us 
eight distinct groups of traits. 
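
The successive halving just described is equivalent to grouping the traits by the pattern of signs of their loadings on the second, third and fourth factors. The sketch below shows that grouping. The cluster labels are those the paper attaches to each sign pattern; the loadings in the example are hypothetical and are not taken from the study, and counting a zero loading as positive is an assumption.

```python
from collections import defaultdict

# Cluster labels keyed by the signs of the loadings on factors 2, 3 and 4
# (the first factor is positive throughout), as named in the paper.
CLUSTER_NAMES = {
    ("+", "+", "+"): "entertaining",
    ("+", "+", "-"): "brilliant",
    ("+", "-", "+"): "cultured-talented",
    ("+", "-", "-"): "just",
    ("-", "+", "+"): "athletic, attractive, sociable",
    ("-", "+", "-"): "good fellow",
    ("-", "-", "-"): "good neighbor",
    ("-", "-", "+"): "diplomatic",
}


def sign(loading):
    """Return '+' or '-'; a zero loading is counted as positive (assumption)."""
    return "+" if loading >= 0 else "-"


def cluster_by_signs(loadings):
    """loadings: dict of trait -> (f1, f2, f3, f4). Group traits by sign pattern."""
    clusters = defaultdict(list)
    for trait, (_, f2, f3, f4) in loadings.items():
        clusters[CLUSTER_NAMES[(sign(f2), sign(f3), sign(f4))]].append(trait)
    return dict(clusters)


if __name__ == "__main__":
    # Hypothetical loadings, for illustration only.
    example = {
        "Is amusing": (0.50, 0.30, 0.20, 0.10),
        "Is lively": (0.40, 0.25, 0.35, -0.15),
        "Is fair": (0.70, 0.10, -0.20, -0.10),
        "Is modest": (0.45, -0.20, -0.30, -0.05),
    }
    print(cluster_by_signs(example))
```

Because only the signs are used, two traits with very different loadings in size can fall in the same cluster, which is one reason the clusters are better read as tendencies than as pure types.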

Curiously enough upon examination these statistically de- 
termined clusters appear also to be quite logical groupings 
and in most cases can rather easily be given a general label. 
The groupings may, broadly speaking, be said to indicate
types of personality, if we bear in mind that there are no
pure types and that there is much overlapping of types. 


Trait Clusters


The entertaining, in which all four factor loadings are positive

Is amusing
Is entertaining
Is interesting in conversation


The athletic, attractive, sociable, in which the first factor loading is
positive, the second negative, and the third and fourth positive

Is athletic
Is attractive in personal appearance
Is beautiful or pretty
Has a smiling countenance
Is sociable


The brilliant, in which the first three factor loadings are positive
and the fourth negative

Is competent
Has individuality
Is lively
Has a sense of humor
Is witty


The good fellow, in which the first and third factor loadings are
positive and the second and fourth are negative

Is frank
Is generous
Is good natured
Is a ‘‘good sport’’
Is helpful
Is natural, unaffected


The cultured-talented, in which the first, second and fourth factor
loadings are positive and the third negative

Is clever
Is cultured
Is original
Is talented in some field of art
Is well informed
Has wide interests


The just, in which the first two factor loadings are positive and the
third and fourth are negative

Is fair
Has good judgment
Is honest
Is idealistic
Is intelligent
Is understanding


The good neighbor, in which the first factor loading is positive and
the other factor loadings negative

Is dependable
Is industrious
Is loyal
Is modest
Is neat
Is sincere
Is tolerant
Is unselfish


The diplomatic, in which the first and fourth factor loadings are
positive and the second and third negative

Is adaptable
Is considerate of others
Is not easily excited
Has a pleasant voice
Is sympathetic
Is tactful

The average correlations between leadership and the traits 
under each group are as follows: 


The entertaining
The brilliant
The cultured-talented
The athletic, attractive, sociable
The good fellow
[entry illegible]
The diplomatic


This list of average correlations would seem to suggest that 
in the group of leaders studied there are not more than four 
types of leadership ability represented: the entertaining, the
brilliant, the cultured-talented, and the just. It is also very
interesting that the diplomatic group has generally low asso- 
ciation with leadership. 

However, these clusters do not represent real types; the factor
analysis indicates only which traits tend to be most highly
interrelated. We therefore picked from each cluster the one trait
that correlated most highly with the criterion of leadership. These
traits are amusing, lively, wide interests, intelligence, athletic,
‘‘good sport,’’ not modest, and pleasant voice. We then checked over
the lists again and determined the average number of these eight
traits attributed to each girl. The score thus obtained correlated
with leadership to the extent of .57.

The appearance of the scatter diagram, however, suggested 
the probability of a non-linear regression. Calculation showed 
the Eta to be .66. Applying Blakeman’s formula for testing 
the linearity of regression it appears that the chances are 
ninety-two in one hundred that the relationship between 
leadership and the average of the selected eight traits is 
curvilinear. 
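
The comparison of the product-moment coefficient with Eta can be illustrated by a short sketch. The data below are invented, and the function names are hypothetical. The article gives neither its formula for Blakeman's test nor the number of cases, so the statistic N(Eta² - r²), usually compared with 11.37 in the simplified form of that test, is shown here only as an assumption about what was computed.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlation_ratio(xs, ys):
    """Eta of y on x: spread of the column means of y relative to total spread."""
    columns = {}
    for x, y in zip(xs, ys):
        columns.setdefault(x, []).append(y)
    grand = mean(ys)
    between = sum(len(col) * (mean(col) - grand) ** 2 for col in columns.values())
    total = sum((y - grand) ** 2 for y in ys)
    return sqrt(between / total)

def blakeman_statistic(n, eta, r):
    """N * (eta^2 - r^2); assumed simplified form of Blakeman's criterion."""
    return n * (eta ** 2 - r ** 2)

# Invented scores: number of the eight traits credited, and a leadership rating.
trait_score = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
leadership = [0, 0, 0, 1, 0, 1, 2, 3, 3, 5, 2, 6]

r = pearson_r(trait_score, leadership)
eta = correlation_ratio(trait_score, leadership)
print(round(r, 2), round(eta, 2), eta >= abs(r))
```

Eta can never fall below the absolute value of r for the same table, so the gap between the two is what a test of linearity examines.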

A closer inspection of the scatter diagram indicates that 
the possession of a large number of these eight traits is no 
guarantee of election to positions of leadership, but the person 
with relatively few of these eight traits has very little chance 
of attaining such positions. 

In conclusion, then, we may say that the method of factor 
analysis has hopeful possibilities for revealing types of leaders 
found in a particular group; that in the group studied there 
seemed to be four types of leadership ability—the entertain- 
ing, the brilliant, the cultured-talented, and the just; but that 
for the highest scores on our criterion of leadership a per- 
sonality embracing qualities from among all the types is 
necessary. The qualities basic to leadership as we found them 
are liveliness, wide interests, intelligence, good sportsmanship, 
ability to amuse, athletic prowess, a pleasant voice, and the 
absence of modesty. With these eight traits leadership is not 
guaranteed, but without a majority of them leadership is 
improbable. 

















THE RELATION OF SCHOOL MARKS TO INTELLIGENCE
IN SECONDARY SCHOOLS


ROBERT H. BURGERT 
Roosevelt Junior High School, San Diego, California 


In a system where all of the children of all of the people are
brought together into one great organization it is necessary
to develop a method of handling these large numbers
that is efficient, economical, and scientific. In the words of
Arthur D. Hollingshead, ‘‘The development of mass education
and the increased demand for equal educational opportunities
on the part of the masses within the past century have forced
school administrators to devise some scheme that would enable
them to teach large numbers of children in the most economical
manner possible. . . . Courses of study and classes rather than
needs and demands of the individual pupil became the unit of
educational thinking. . . . This system of placing thirty or more
pupils in the same class and teaching them the same subject
matter presupposed a homogeneity of ability and achievement
in reference to all subjects being taught.’’¹ Thus it is evident
that there is a real need for scientific investigation in regard
to pupil achievement and success as commonly measured.
Marks in the various subjects, the relation of marks to
intelligence, and sex differences in marks have long been
controversial subjects in education. As early as 1910 Kelly
began a series of investigations showing the unreliability of
teachers’ marks as a basis of true measurement of achievement.
This was followed by the investigations of Dearborn, Thorndike
and others whose findings substantiate those of Kelly.
The importance of these findings is at once apparent in the
light of the investigations carried on at the same time in the


1 A. D. Hollingshead, An Evaluation of the Use of Certain Educational
and Mental Measurements for Purposes of Classification. New York:
Teachers College, Columbia University Contributions to Education No.
302, 1928. Page 1.

field of individual differences. The realization of the importance
of the recognition of the individual as being the most
important unit in the classroom has led to the organization of 
programs into unit assignments, homogeneous grouping, spe- 
cial classes, counseling and guidance departments, achievement 
and objective testing, and educational measurements. It has 
forced the recognition of individual differences in terms of 
ability and achievement. The individual rather than the sub- 
ject matter is being considered the starting point of the educa- 
tional process. The demand that has been placed upon the use 
of scientific method and the acceptance on the part of school 
administrators of the measurement movement have been so uni- 
versal that there has been little time or effort left for checking 
and evaluating the effectiveness of tests in their actual opera- 
tion as a basis of classifying pupils. 

One of the most important functions of education, and a recent
development in the field of administration, is the counseling
and guidance movement. Inasmuch as present educational
practice favors the classification of pupils it seems
necessary to have as adequate a means as possible for predict- 
ing academic success. Thus, the aim of classification becomes 
one of bringing together pupils of like ability who seem likely 
to profit similarly from instruction given to the group. This 
possibility of prediction is of value to the counselor in group- 
ing the students into sections that most nearly fit the needs and 
capabilities of each individual student. It is therefore of 
value to both the teacher and counselor in making provisions 
for individual differences within the group. F. W. DeSilva in 
considering the relationship of homogeneous grouping to the 
guidance and counseling program has pointed out, ‘‘ Certainly 
if the guidance program is to function smoothly in a secondary 
school in which students are grouped or classified for instruc- 
tional purposes a great deal depends upon the reliability of 
the criteria for grouping and classifying the pupils. The 
better the counselor can predict the success of pupils the more 
intelligently he will be able to place the students for instructional
purposes. The better the pupils are placed the less, of
course, will be the maladjustment, which is one of the big aims
of guidance.’’²

There is little agreement among authorities as to the basis 
upon which this grouping should be made. This is due, per- 
haps, to the lack of statistical agreement to justify any one 
method. In general the following factors or combination of 
factors form the basis for most plans of homogeneous group- 
ing: 1) intelligence quotients, 2) mental age scores, 3) achieve- 
ment test scores, 4) teachers’ marks or recommendations, and 
5) educational age or achievement ratios. C. C. Ross³ carried
on very significant investigations in an attempt to discover the 
relation between grade school records and high school achieve- 
ment in English, Latin, mathematics, and general average in 
all subjects. He summarizes his results by concluding that the 
best basis for predicting high school success would seem to be 
a combination of the following: intelligence ratings, as a mea- 
sure of native endowment; achievement tests, as evidence of 
the prerequisite academic achievement; and teachers’ judg- 
ments by grades, as a measure of attitudes and moral habits 
which are important factors in determining school success. 

E. K. Fretwell⁴ investigated eleven different tests for their
predictive value and found that collectively they provided a
better criterion than did any single basis. S. S. Marzolf⁵
combined marks obtained in grades five and six with intelligence
test scores and found statistical agreement yielding a positive
coefficient of correlation of 0.83. This is the highest degree
of agreement that has been recorded by any investigation.


2 F. W. DeSilva, A Study of Measures Used in Predicting Academic
Achievement in Grade Seven in the Junior High School. Unpublished
Master’s Thesis, University of Southern California. 1933. Page 6.

3 C. C. Ross, The Relation between Grade School Record and High
School Achievement. New York: Teachers College, Columbia University.
1925. Page 70.

4 E. K. Fretwell, A Study in Educational Prognosis. New York:
Teachers College, Columbia University Contributions to Education No. 99.
1919. 55 pages.

5 S. S. Marzolf, ‘‘The Classification of High School Students,’’ School
and Society, 32: 881-882. 1930.

H. L. Cronk⁶ investigated the relation existing between school
marks and intelligence quotients and found a low positive
agreement. The highest coefficients were found in mathematics
and similar subjects, which leads him to conclude that the
greater objectivity of grading in these subjects makes prediction
more accurate than in the fields of art, music, and literature,
where the bases for determining marks are highly subjective.
It is evident from these few illustrations that a general plan
for grouping is desirable, but as yet there is no best method.

Related to the problem of prediction of academic success is
the question whether or not boys and girls achieve similarly in
school. If they do we can dispense with the problem; if not,
we must ascertain what differences exist between the sexes and
how these differences may be adjusted. E. A. Lincoln’s⁷
investigations led him to the statement that girls are definitely
superior in all school subjects, this difference being most
apparent in the language course and in English and literature,
while the boys make their best showing in science and
mathematics, though they do not always surpass the girls as far as
grades are concerned. F. H. Lund⁸ differs with this attitude
and attempts to show that the superiority of girls as generally
conceived does not exist except in terms of type of educational
mastery. He states that girls’ superiority depends upon the
type of measure used. Grades or marks give girls a pronounced
superiority; but if objective tests were used the difference
was greatly reduced; or if long range educational tests
were used the positions of the boys and girls would be reversed.
Thus, we find the theory that girls are prone to memorize
whereas boys understand the material more thoroughly at the
expense of memorization of detail. If this be true it is a
serious charge upon the usual practice of distributing grades.


6 H. L. Cronk, A Study of Prognosis on the Junior High Level.
Unpublished Master’s Thesis, University of Southern California, 1933.

7 E. A. Lincoln, Sex Difference in the Growth of American School
Children. Baltimore, Warwick and York. 1927.

8 F. H. Lund, ‘‘Sex Differences in Type of Educational Mastery,’’
Journal of Educational Psychology 32: 322-323. 1932.

The middle road is taken by Koos⁹ who attempts to credit the
girls with superior work in school in terms of their earlier
maturity. He explains this variation between sexes by saying,
‘‘Boys and girls are not readily compared in respect to mental
processes. . . . Earlier pubescence gives them (girls) an average
age somewhat in advance of that for boys. . . . After
the middle grades four and five girls have demonstrated
superiority in scholarship as it is ordinarily measured in the
schools. . . .’’ This superiority he attributes to: (1) interests,
tastes, and the like; (2) greater rigidity of social control
of girls than boys; and (3) both influences working together.

Again we find a similar divergence of attitudes as to the rela- 
tive achievement of boys and girls. It is clear that educators 
are by no means agreed as to the cause and solution of many of 
the major educational problems of the day. 

This widespread difference of opinion on the part of so many
investigators led the author of this article to investigate these
problems in terms of his own school. The records of 191
graduating ninth grade students from the Roosevelt Junior
High School furnish the data for this investigation. Boys
and girls were listed separately and an average grade for each
subject for each pupil was calculated from the data filed in
their personnel and permanent record cards. Each average
score was assigned to a point scale ranging from 1, equivalent
to an ‘‘A,’’ to 11, which was the score assigned to the mark
‘‘F.’’ The intelligence quotient of each pupil was obtained
for grade 6A and for grade 9A. The tests used were the Terman
Group Test, Form A, in grade six, and the Otis Self-Administering
Test of Mental Ability, Higher Examination, in grade
nine. From this array of data the range, mean, and standard
deviation for each of the groups were obtained, and by
combining both groups the same measures were computed for the
entire group. From the statistical data obtained McCall’s
Experimental Coefficient was calculated for grade six and
9 L. V. Koos, ‘‘Variations among Pupil Differences Determined by
Sex,’’ The American Secondary School. New York: Ginn and Co.,
Chapter III. Pages 101-102.

grade nine to measure any differences existing between boys
and girls in native ability. The results obtained are tabulated
in Table I.

TABLE I

A Summary of Intelligence Quotients for All Students Entering Grade 7B
September, 1931, and Leaving Grade 9A June, 1934

             RANGE
SEX       From     To      MEAN       σ

Boys       76             110.00
Girls      78             110.32
Both       76             110.06

Boys       83             108.25
Girls      80             108.41
Both       80             107.94

Number = 191

Thus, it is evident from a survey of the material in Table I
that there is no statistically significant difference between the
boys and girls in terms of native endowment, either when they
entered school in 1931 or when they left in 1934. The very low
experimental coefficient would indicate that for educational and
instructional purposes the boys and girls formed a homogeneous
group both at entrance to and at graduation from junior high
school.

Table II summarizes the average marks received in
certain subjects by all students considered in this study during
their enrollment in junior high school and compares them by
the experimental coefficient method as to differences existing
between sexes. The averages and standard deviations are
tabulated according to a qualitative scale giving the lowest
numerical valuation to the highest qualitative score.
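
The experimental coefficient method can be sketched as follows. The formula used here, the difference between the means divided by 2.78 times the standard error of that difference, is the form usually attributed to McCall and is an assumption, since the article does not state it. The group sizes of 77 boys and 114 girls are likewise hypothetical; the article reports only the total of 191. The means and standard deviations are those given for average achievement.

```python
from math import sqrt

def experimental_coefficient(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference of two means divided by 2.78 standard errors of the difference."""
    se_difference = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return abs(mean_a - mean_b) / (2.78 * se_difference)

# Average achievement: boys 5.856 (S.D. 1.90), girls 4.696 (S.D. 1.91).
# The sizes 77 and 114 are assumed, not reported in the article.
ec_average = experimental_coefficient(5.856, 1.90, 77, 4.696, 1.91, 114)
print(round(ec_average, 2))  # prints 1.49
```

Under this convention a coefficient of 1.0 or more is ordinarily read as practical certainty that a true difference exists, so the result depends on the assumed group sizes as well as on the tabulated means.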

Inspection of Table II yields highly significant data. In all
cases the girls excelled the boys at the mean, and the experimental
coefficient yielded a difference that favored the girls
in every case, although statistically it was significant only in
English, mathematics, music, art, and average achievement.
Although, for the purpose of studying academic achievement,
the marks of any one teacher are highly unreliable due to the
subjectivity of the assignment of grades, it may be well to
accept the skeptical remark of Kelly¹⁰ upon the reliability of
marks as recorded over a period of years when he states, ‘‘We
should expect the average of the estimates of a dozen or more
teachers to come pretty close to the ranking of young people.
The pupil’s record is the most complete, detailed and accurate
of all records of the ordinary pupil from his entrance into
school to his entrance into work.’’


TABLE II

Summary of Marks Received by 191 Junior High School Students in Class
Subjects During Their Enrollment 1931-1934 and Measures
of the Statistical Significance of Sex Differences
in Marks in Certain Subjects

                                                            IS THE DIFFERENCE
SUBJECT              SEX      MEAN      σ      E.C.         SIGNIFICANT?

English              Boys     5.869    1.89
                     Girls    4.254    2.19
                     Both     5.479    2.25
Mathematics          Boys     5.711    2.11
                     Girls    4.805    2.28
                     Both     5.116    1.67
Social Science       Boys     5.461    1.96
                     Girls    4.886    1.95
                     Both     5.116    1.97
Gen’l. Science       Boys     6.595    2.37
                     Girls    5.708    2.54
                     Both     5.093    2.87
Music                Boys     5.293    2.05
                     Girls    3.832    1.84
                     Both     4.386    2.07
Art                  Boys     6.265    1.98
                     Girls    4.872    2.19
                     Both     5.378    2.16
Aver. Achievement    Boys     5.856    1.90    1.49 Girls   Yes
                     Girls    4.696    1.91
                     Both     5.162    1.97


10 F. J. Kelly, Teachers’ Marks. New York: Teachers College, Columbia
University Contributions to Education, No. 66. 1914, cited by F. W.
DeSilva, op. cit., p. 20.

The second part of this investigation aimed to determine
the value of the intelligence quotient as a means of predicting
school success as measured by marks. For this purpose the
records of the boys and girls were combined to form a single
group for each subject and the data were correlated according
to the Pearson Formula. The correlations obtained, together
with the probable error for each of the several subjects,
are tabulated in Table III.


TABLE III

Correlations Between Average School Marks and Grade 6A
Intelligence Quotients

SUBJECT                        r        P.E. OF r

1. English                   +0.51      ±0.014
2. Social Science            +0.32      ±0.049
3. Mathematics               +0.41      ±0.038
4. General Science           +0.55      ±0.039
5. Music                     +0.26      ±0.053
6. Art                       +0.20      ±0.063
7. Average Achievement       +0.48      ±0.040
8. Otis IQ (1934)            +0.84      ±0.010

To summarize the facts arrived at through the inspection of 
these statistics, we find that all correlations are low and posi- 
tive. Few of them give greater than about thirteen per cent 
accuracy for prediction. This is in agreement with the find- 
ings of other investigators and substantiates the fact that as a 
means of predicting academic success as generally measured 
by grades, the intelligence quotient is unsatisfactory. We 
may assume that the combination of several factors such as 
the intelligence quotient, mental age scores, achievement 
ratios, and teachers’ judgments will provide a better means 
of predicting academic success than any one measure. 
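
The figures discussed above can be related to the tabulated correlations by a short sketch. The article does not say how the ‘‘thirteen per cent’’ was obtained; the index of forecasting efficiency, 1 - sqrt(1 - r²), is used here only as one plausible reading. The probable error is computed by the classical formula 0.6745(1 - r²)/sqrt(N), with N assumed to be 191 for every subject, which the article does not confirm. The function names are hypothetical.

```python
from math import sqrt

def probable_error_of_r(r, n):
    """Classical probable error of a correlation: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

def forecasting_efficiency(r):
    """Share by which the error of estimate is reduced: 1 - sqrt(1 - r^2)."""
    return 1 - sqrt(1 - r ** 2)

# Correlations taken from Table III; N = 191 is an assumption.
for subject, r in [("Average Achievement", 0.48), ("General Science", 0.55)]:
    pe = probable_error_of_r(r, 191)
    efficiency = forecasting_efficiency(r)
    print(subject, round(pe, 3), round(100 * efficiency, 1))
```

For r = 0.48 the index comes to about 12 per cent, which is of the same order as the figure quoted in the text; a correlation must exceed roughly 0.85 before the index reaches one half.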


SUMMARY AND RECOMMENDATIONS 


The differences existing between boys and girls in specific
subjects and average achievement are marked and in most
cases statistically significant. Girls receive higher marks than
boys in all subjects, the difference being of statistical importance
in English, mathematics, music, art, and average achievement.
The predictive value of the Terman test for academic
achievement ranges from a positive 0.20 to 0.55, with a positive
correlation of 0.48 for average achievement. This is not a
high degree of agreement, but it does indicate a low positive
relationship. The intelligence quotient as yielded by the Terman test
is most valuable for predicting success in English, mathematics,
general science, and average achievement; and is of
little importance for the prediction of success in art or music.
There is little or no change in intelligence as measured in
grade six and grade nine. The slight decrease in the range
and at the mean may be attributed to the increased chronological
age.

Inasmuch as there is no difference in abstract mental ability
between boys and girls we may assume that the difference
existing in achievement may be attributed to factors other
than those measured by intelligence tests. It is recommended
that careful investigation of personality traits and their relation
to intelligence and school achievement be made as a basis
for explaining these differences. Possibly teachers need more
objective methods for determining grades, as it would seem
that the correlations are higher in subjects of a scientific and
objective nature whereas there is a very low agreement in the
non-academic subjects of art and music.

The low correlations yielded by comparing intelligence quotients
with teachers’ marks would indicate that for the purpose
of predicting school success a more scientific basis is
needed. This would, perhaps, necessitate the use of a combination
of several factors such as intelligence, grade school
record, teachers’ judgments, mental age, etc. Although a
point for point agreement has not been reached in all cases the
intelligence quotient functions to determine whether a student
should be reasonably expected to do satisfactory work or fail.
Finally, it may be recommended that all unscientific methods
of grouping and grading be discarded and that the many capable
and interested workers in this field combine their efforts toward
the solution of this problem.





THE GOODENOUGH INTELLIGENCE TEST 
IN INDIA 


EMIL W. MENZEL 
Bisrampur, India 


Very little intelligence testing has been done in India.
D. S. Herrick (4) published in 1921 a comparative
study on the differences in performance on the Goddard
Form Board of American children and the high caste
Brahmins and low caste Panchamas in southern India. He
found only a slight difference between the Brahmins and
Panchamas but a considerable difference between both these
Indian groups and American children (an average of 64 seconds
longer was required by the Indians to complete the
performance).

Soon after this date, C. Herbert Rice (7) began his adaptation
of the Stanford-Binet scale and in about 1924 published a
manual of directions for the administration of the test in India.
A fuller report on his experiment appeared in 1929. His
standardization is based on 929 cases. No comparison was made with
children of other countries. The scale had to be thoroughly
adapted to meet Indian conditions, and these necessary changes
make later comparison with subjects of other nationalities
difficult. A valuable comparison was made between the children
from various castes, or social classes, in the Punjab,
which revealed very little difference between the abilities of
high caste and low caste children.

With the exception of the two above-named studies and the 
several shorter articles based on Rice’s work, no further in- 
telligence testing of Indian children is reported. Rice’s 
Hindustanee scale is not receiving the attention it deserves and 
at present there seems to be little if any testing done. 

In 1932-33 the writer (5) administered the Goodenough 
‘‘draw-a-man’’ test to 2600 children ranging in age from 6 
to 20 in the Central Provinces. As the environment is so
totally different from that in America and the places in which
most intelligence testing has been done, it is necessary to keep 
these differences clearly in mind when making comparisons. 
Most of the children come from primitive surroundings. The 
income of the average family probably does not exceed $100 
per year and in a high proportion of cases is less. All social 
classes were included in the experiment but the great majority 
of subjects belong to the peasant classes. All pupils present 
in school on the day the test was given are included. The 
selection of schools is a random sampling and is believed to be 
quite typical, including both government and mission schools, 
city and rural, boys’ and girls’ schools, middle and primary. 

Literacy for India is quoted in the 1931 census as being
9 per cent (15 per cent for boys and 2.9 for girls). In the
school system, in the eleven grades from the first to the last
year of high school, more than half of the pupils are enrolled
in the first grade. Two-thirds of the pupils never get beyond
the first grade and only 15 per cent succeed in passing the
fourth grade. As is to be surmised, the school equipment and
administration are decidedly poor, the mode of instruction
stereotyped and largely by rote, and the attendance irregular.
The intelligence of children brought up with such handicaps in
environment and schooling cannot fairly be compared with that
of children in more fortunate surroundings.

The only change made in the standardized testing and
scoring procedures as given by Goodenough (2) was to score
slightly more leniently the amount of clothing indicated in the
drawing. This was thought advisable since in India a far
scantier amount of clothing is worn than in America. This
change would not influence the score more than one per cent.
All pupils 15 years of age or more were entered as being 15.

A classification of drawings, according to the age of the 
pupils who executed them, revealed a fairly regular increase 
in scores with increasing age. (Table 1.) This amounts to 
approximately two points per year, beginning with 9.45 at 
6 years and ending with 26.55 at 15 years (mean). The standard
deviation ranged from approximately 3 to 10, increasing
with age. Thirty-seven and seven-tenths per cent of all pupils
were able to match the average score of pupils one year older 
than themselves while 63.2 per cent of the pupils equaled the 
scores of those one year younger. 


TABLE 1
Scores by Ages

AGE      CASES    MEAN SCORE    MEDIAN SCORE    S.D.

 6.5      128        9.45           9.3          3.24
 7.5      244       10.45          11.           3.84
 8.5      341       12.35          12.9          5.3
 9.5      344       14.1           14.3          6.
10.5      321       16.85          16.9          6.75
11.5      297       18.3           18.2          7.3
12.5      276       19.45          19.1          7.8
13.5      166       21.85          22.5          9.15
14.5      155       25.8           26.9         10.2
15.5      332       26.55          27.5          9.2





From grade to grade the increase in score was approximately
three points, a fifty per cent greater progress than by age.
(Table 2.) Here again the increase was fairly regular, beginning
with 9.83 in the first grade and ending with 31.8 in the
eighth grade, with SD’s of 2.7 to 8.45. On the average 31.1 per
cent of the pupils in the next lower grade scored as high as
those of a given grade, while 66.3 per cent of those in the next
higher grade made as high a score.
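
The two rates of increase can be checked from the end points quoted in the text. The sketch below uses only those end points (mean scores at ages 6 and 15, and in grades one and eight), so it is a rough average and not a fitted trend; the function name is hypothetical.

```python
def average_gain(first_score, last_score, steps):
    """Average increase in score per step between two end points."""
    return (last_score - first_score) / steps

gain_per_year = average_gain(9.45, 26.55, 9)   # ages 6 to 15: nine yearly steps
gain_per_grade = average_gain(9.83, 31.8, 7)   # grades one to eight: seven steps
ratio = gain_per_grade / gain_per_year

print(round(gain_per_year, 2), round(gain_per_grade, 2), round(ratio, 2))
# prints 1.9 3.14 1.65
```

The gain per grade is thus somewhat more than one and one-half times the gain per year, in line with the rounded figures of two and three points.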


TABLE 2
Scores by School Grades

GRADE    CASES    MEAN SCORE

I         446        9.83
II        565       12.63
III       421       14.95
IV        391       17.8
V         285       22.92
VI        275       26.55
VII       125       28.4
VIII                31.8

This considerable discrepancy in the rate of increase in performance
between age groups and grade groups is readily understood
when the extreme retardation in school is taken into
account. The average age of the first graders is 8.6 years and
of the second graders 10.1 years. (Table 3.) When all pupils
retarded more than two years in school were removed from the
groups, the average score was considerably bettered, the advantage
being as high as 4.65 points among the 14 year olds.


TABLE 3
Age, Grade Distribution

                              GRADE
AGE          I    II   III   IV    V    VI   VII   VIII
Oe natin 105 23 
fae 141 91 12 
| ee 105 = «151 67 816 2 
DB ccsitcien 61 133 81 39 19 1 
ae 18 oo ee wel ee. ie 
5 nn 8 36 a ee 
_ ae 8 39 44 89 54 £30 3 1 
| 28 20 48 5 33 16 3 
265 :...... 5 9 33 2 43 20 £20 
a 2 8§ 13 13 4 2 2 
6S... 3 2 5 4 31 17 
SET enieeion 1 3 Sm. 3B. me 
Be one 2 7 8 6 
ge GIS lee i Ree a a Cle 4 1 2 
_) ee 4 3 
Average Age ...........    8.6  10.1  11.2  12.4  14    14.9  16.3  16.46
Expected Age ...........   6.6   7.6   8.6   9.6  10.6  11.6  12.6  13.6





Since the primary school (first to fourth grade) is the end 
of schooling for the overwhelming majority, the middle school 
pupils are the select group. The difference in performance of 
primary and middle school pupils of the same age ranged from 
6.5 to 9.2 score points. (Table 4.) The scores of both schools 
are noticeably affected by the proportion of retardation found 
among the pupils of each grade. 







TABLE 4

Difference in Performance of Primary School and Middle School
Pupils of the Same Age

Age: 10.5, 11.5, 12.5, 13.5

Primary Schools
    No. of Pupils: 175, 61
    Mean Score: 14.3, 17., 16.7

Middle Schools
    No. of Pupils: 61, 98, 96, 105
    Mean Score: 23.65, 23.85, 23.5, 25.

Average score for all pupils: 16.85, 18.3, 19.45, 21.85





No reliable difference was found in the performance of rural 
and city dwellers. (Table 5.) However, the Central Prov- 
inces contain no really urban communities, even the largest 
cities being decidedly rural in atmosphere. Neither was there 
an impressive discrepancy between groups from the eastern 
part of the Province, which is alleged to be the more backward 
part, and the western, reputedly the more advanced part. 


TABLE 5
Rural and City Primary Pupils Contrasted

         NO. OF RURAL     MEAN     NO. OF CITY      MEAN     DIFFERENCE IN
AGE      SCHOOL PUPILS    SCORE    SCHOOL PUPILS    SCORE    FAVOR OF RURAL

 6.5          60          10.32          68          8.74         1.58
 7.5         146          10.04          98         10.45         -.41
 8.5         228          13.           111         12.95          .05
 9.5         234          13.55          90         13.25          .30
10.5         159          15.15         102         16.6          -.7
11.5         135          16.4           64         15.1          1.3
12.5         125          17.            50

Total                                   583

Difference in the average of means 0.06.
Pupils above 12 years of age were too few to permit reliable comparison.


The difference in the performance of boys and girls is reliable
and impressive, averaging slightly more than a year’s
growth in favor of the boys. (Table 6.) The girls were also
found to be more highly retarded in school even though they
have the advantage of better trained teachers and better supervised
schools. Forty-eight per cent of the boys performed better
than the mean score of their age group (both sexes included)
while only 39 per cent of the girls did so.


TABLE 6
Performance of Boys and Girls

         NO. OF    MEAN SCORE    NO. OF    MEAN SCORE    DIFFERENCE
AGE      GIRLS     FOR GIRLS     BOYS      FOR BOYS      IN SCORE

 6.5       37         9.42         91         9.5           .08
 7.5       67        10.25        177        10.45          .2
 8.5       68        11.85        273        12.65          .8
 9.5       81        12.7         263        14.55         1.85
10.5       88        14.35        234        17.45         3.1
11.5       92        17.35        205        19.           1.65
12.5       91        17.5         185        20.2          2.7
13.5       63        19.2         103        23.65         4.45
14.5       56        21.6          94        27.9          6.3
15.5                 25.5         134        28.           2.5

Total                            1759

Average of Mean Score   15.97                18.35

Average Difference                                         2.38








This difference in the performance of boys and girls is the 
most surprising result of the study. In America the girls per- 
formed very slightly better than the boys on this scale and 
nowhere excepting in India was a reliable difference in the 
performance of the sexes reported. Unfortunately Rice tested 
only boys when standardizing the Hindustanee Binet so there is 
no further information on sex performance in India. A com- 
parison on other scales should by all means be made. If it is 
true that girls perform less well on other intelligence tests 
also, the cause of this provides an interesting field for investi- 
gation. Girls live a much more secluded type of life in the 
Orient than in the Occident, notably so in India, and it would
be an interesting study to find out if there is a correlation be-
tween seclusion and poorer performance in an intelligence test.


RELIABILITY AND VALIDITY 


On a test and retest of 99 pupils from grades 2-7, a reliability
coefficient of .926, with a PE of .01, was obtained by the Pearson
product-moment formula. Validity for the test for use
in Central Provinces is claimed for the reasons that a fairly 
regular increase in average scores is obtained for groups both 
of increased age and higher standing in school, and that the 
expected effect takes place on the scores when either retarded 
or accelerated pupils are removed from the group. 
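
The reported PE can be checked by hand. The following is a minimal sketch in Python, assuming that the PE was computed with the classical probable-error formula for a correlation coefficient, PE = 0.6745 (1 - r^2) / sqrt(N). The paper does not say which formula was used, and the function names are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def probable_error_of_r(r, n):
    """Classical probable error of r (assumed formula, not stated in the paper)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Figures reported in the text: r = .926 on a test and retest of 99 pupils.
pe = probable_error_of_r(0.926, 99)
print(round(pe, 4))
```

With r = .926 and N = 99 this gives a PE of roughly .0097, which agrees with the .01 reported when rounded to two places.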

When we reduce the Indian scores to IQ’s in terms of the
American norms for the various ages, we find that the Indians
average a score roughly 71 per cent as high as that of the
Americans of the same age. The reduction to an alleged IQ
now makes possible a comparison with other races. Good-
enough (3) has tested various nationalities and races on the
‘‘draw-a-man’’ scale and reports the following averages:


Americans ............ [illegible]
Armenians ............ 92
[three entries illegible]
Southern Negroes ..... [illegible]
American Indians ..... [illegible]





Thus our East Indian ranks slightly lower than any of these 
groups. 

When, however, the difference in environment and educa- 
tional opportunity is taken into account, the result is not 
surprising, for even the southern Negro has decidedly greater 
educational, economic, and sanitary-health advantages than 
the inhabitants of rural Central Provinces in India. When 
we compare the Indian to the pure-blood Negroes who live on 
the island of St. Helena on the coast of Georgia under comparatively
primitive circumstances, we find the Indian actually superior.
In Brazil also there is a close correspondence in score. It is
to be noted that the Brazilians start with a much higher score
but end with the same figures as the Indians. (Table 7.)


TABLE 7 
Points Scored by St. Helena (6), India, and Brazil Groups (1) 





AGE    ST. HELENA      INDIA        BRAZIL

           7.5                       10.9
           6.6      [illegible]      14.6
           9.5      [illegible]      15.1
          13.7         12.9          16.8
          15.3         14.3          19.35
          15.1         16.9          20.5
          18.9         18.2          23.
          18.6         19.1          24.46
          19.1         22.5          24.1
          19.9         26.9          23.81
          24.          27.5          27.34


An assumption that because the Indian child averages 71 on the 
American norms of an intelligence scale or scales, the normal 
Indian child is only as intelligent as the 71 IQ American child, 
is not justified. Before such an assumption is justified, it 
would be necessary to know to what extent inferior educational 
facilities, the prevalence of disease, undernourishment, child
marriage and early motherhood, social institutions such as the 
caste system and seclusion of women, a philosophy of life de- 
cidedly more contemplative and fatalistic than our own, and a 
decidedly restricted play life among children, affect intelligence 
as it is measured by this test. An IQ for Indian children would 
be valid only if based upon norms established for Indian chil- 
dren. 

It is also possible that Indians perform less well on an in- 
telligence test such as the Goodenough than on some other 
type of scale. 


CONCLUSIONS 


The following conclusions are, in the estimation of the 
writer, justifiable:

1. The Goodenough drawing test is adaptable in India as a
tool for the objective measurement of the intelligence of larger
groups for survey purposes, and may prove useful with
individuals when supplemented by additional information.

2. Indians do not make as high scores on this test as Ameri- 
cans of the same age. 

3. Since the differentiation of the performance of the various 
age groups is not as marked in India as in America (two points 
increase per year for the former as against four for the latter) 
the Indian scale is consequently somewhat less sensitive and 
more subject to inaccuracy. This is a difficulty all intelligence 
tests seemingly have to face when applied in a backward en- 
vironment. 

4. Tentative norms, based on 2600 cases, are herewith sub-
mitted. (See Table 8.)


TABLE 8 
Norms for India and America 





AGE        INDIA        AMERICA (GOODENOUGH)

 6.5          9              14
 7.5         11              18
 8.5         13              22
 9.5         15              26
10.5         17              30
11.5         19              34
12.5         21              38
13.5         23              42
14.5         25
15.5         27


5. Some comparisons with other racial groups tested on the 
Goodenough and other scales have been made. Though inter- 
esting, these comparisons are not held to be valid as an index of 
comparative mental power, for the great difference in environ- 
mental background and the imperfection of the instruments of 
measure do not permit a valid comparison. But they do sug- 
gest that the handicap under which the Indian pupil labors is 
a considerable one. 

6. The difference in the performance of Indians and other
races suggests that educational standards and practices cannot
be imported from other countries without thoroughgoing modi-
fication and adaptation which take into account the handi-
caps and advantages under which the Indian pupils labor.
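
The yearly gains cited in conclusion 3 can be read off the tentative norms of Table 8. The following is a minimal sketch in Python, with the table values typed in by hand; the dictionary and function names are illustrative and are not part of the study.

```python
# Norms from Table 8: age in years -> points scored on the drawing test.
INDIA = {6.5: 9, 7.5: 11, 8.5: 13, 9.5: 15, 10.5: 17,
         11.5: 19, 12.5: 21, 13.5: 23, 14.5: 25, 15.5: 27}
AMERICA = {6.5: 14, 7.5: 18, 8.5: 22, 9.5: 26,
           10.5: 30, 11.5: 34, 12.5: 38, 13.5: 42}


def yearly_gains(norms):
    """Points gained per year of age between each pair of adjacent norm ages."""
    ages = sorted(norms)
    return [(norms[later] - norms[earlier]) / (later - earlier)
            for earlier, later in zip(ages, ages[1:])]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


print(mean(yearly_gains(INDIA)), mean(yearly_gains(AMERICA)))
```

Every step in the Indian column is two points per year and every step in the American column is four, which is the contrast drawn in conclusion 3.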


NEWS AND NOTES


The attention of the readers of the JOURNAL is called to ‘‘Tables for
Finding Point-Hour Ratios’’ compiled by Dr. Ward C. Halstead, of
Chicago University, formerly of the Department of Psychology, Ohio
University. The purpose of these tables is to expedite the finding of the
point-hour ratio when a great number of such computations are to be
made. ‘‘The tables are complete through 20 hours and 90 points, thus
being applicable to both the quarter and the semester division of the
academic year. Likewise, they may be used for both the system where
A=3, B=2, etc., and where A=4, B=3, etc.’’ These tables are espe-
cially useful to those engaged in research in Psychology and Education
where the point-hour ratio is used as one variable. The tables are now
in mimeographed form and may be secured from the JOURNAL OF APPLIED
PSYCHOLOGY, Ohio University, Athens, Ohio, at the following prices:
single copy, 50 cents; 5 to 19 copies, 30 cents each; 20 or more copies,
20 cents.
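
A lookup table of this kind is simple to regenerate. The sketch below, in Python, assumes that the point-hour ratio is total grade points divided by total credit hours and that entries are rounded to two decimals. Neither the rounding nor the layout of Dr. Halstead's tables is described in the notice, and the function names are hypothetical.

```python
def point_hour_ratio(points, hours):
    """Grade points earned divided by credit hours carried."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return points / hours


def build_table(max_hours=20, max_points=90):
    """Precompute ratios, rounded to two places, keyed by (points, hours).

    The default limits follow the notice: 20 hours and 90 points.
    """
    return {
        (points, hours): round(point_hour_ratio(points, hours), 2)
        for hours in range(1, max_hours + 1)
        for points in range(0, max_points + 1)
    }


table = build_table()
# A student carrying 15 hours who earned 45 points.
print(table[(45, 15)])
```

Because the ratio depends only on points and hours, the same table serves either grading scale mentioned in the notice; only the way points are accumulated from letter grades differs.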


‘‘The United States lost an educational statesman of first rank with
the death of Dr. William John Cooper, the eighth Commissioner of Edu- 
cation,’’ stated J. W. Studebaker, present United States Commissioner of 
Education. The Office of Education was reorganized to a considerable 
extent by Dr. Cooper and its service was extended into a number of pio- 
neer fields in education. Specialists in education by radio, education of 
exceptional children, education of Negroes, tests and measurements, and 
comparative education were added to the staff of the department under 
his administration. Since his resignation as Commissioner of Education, 
Commissioner Cooper had served as professor of education at George 
Washington University. 


The Child Research Clinic of The Woods Schools, Langhorne, Pa., re-
cently announced as the keynote of the Second Institute ‘‘The Contribu-
tion of the Sciences.’’ Outstanding leaders in the fields of neurology,
psychiatry, psychology and related subjects contributed to an all-day con-
ference held at the Schools on Tuesday, October 15, 1935. Among the
speakers were Dr. Lauretta Bender, Senior Psychiatrist in charge of the
Children’s Ward, Bellevue Psychiatric Hospital, New York City; Dr.
Harriet Babcock, Psychologist, of New York City; Dr. Thomas H.
Haines, Psychiatrist to Out-Patients, New York Hospital; Dr. Ira S.
Wile, Psychiatrist, Children’s Clinic, Mount Sinai Hospital, New York,
and many others.


Dr. M. E. Broom, formerly of the State Teachers College, San Diego,
California, has recently become Assistant Superintendent of Schools of El
Paso, Texas, and Editor of the El Paso Schools Standard, the professional
journal of that school system.


From a recent report made by Dr. William Lewin, Chairman of the 
Motion Picture Committee of the National Education Association’s De- 
partment of Secondary Education, it is found that sixty-three photoplays 
of educational interest will be released during the present academic year 
and that ten pictures have been tentatively selected for discussion in 
motion-picture appreciation courses. Dr. Lewin, who has just completed 
a survey of production in Hollywood, found producers, writers and direc- 
tors interested in the study of motion picture appreciation. He received 
the fullest cooperation from all officials while in Hollywood. He also vis- 
ited teachers colleges and universities and found much interest manifested 
in the photoplay appreciation movement. The University of Southern 
California, Columbia University, and New York University are among the 
score of universities that have already instituted these courses. In view 
of the upward trend in the number and quality of photoplays of interest 
to teachers and students, it is Dr. Lewin’s opinion that the picture- 
appreciation movement will be rapidly adopted in schools throughout the 
nation. Already more than 2,000 high schools are teaching new units in 
this field. 





BOOK REVIEWS 


ORDWAY TEAD. The Art of Leadership. New York: McGraw-Hill Book
Company, 1934. Pp. xi + 308.


This is the best book on leadership up to the present. It has a com- 
mendable absence of the inspirational, autobiographical discussion and 
gives a systematic presentation of problems confronting those who are 
interested in becoming leaders or developing leaders among their subor-
dinates. Tead’s definition of leadership is worth quoting: ‘‘The activity
of influencing people to cooperate toward some goal which they come to 
find desirable.’’ The last clause indicates the novel part of the defini- 
tion. 

One of the best chapters is on how the leaders influence others. Good 
examples are given of the use of suggestion and imitation. Public ex- 
hortation should be used sparingly. Persuasive argument is necessary 
where leadership deals with ideas or qualities. Publicity is often neces- 
sary and is very well distinguished in the text from propaganda. Note 
is made of the way that the leader becomes a symbol so that interests of 
the public shift from personal doings and characteristics to the cause for 
which he stands. With reference to objectives the leader and the follow- 
ers should have something in common. The author decries profit as the 
sole motive and urges corporate objectives with the subordinates joining 
in the creation of these objectives. 

The next three chapters discuss qualities necessary in leaders, such as 
physical and nervous energy, purpose and enthusiasm. Another quality 
which receives more stress than in most discussions is friendliness toward 
those who are led, thus predisposing them to do what the leader wants. 
The author suggests deliberately cultivating habits of friendliness by such 
procedures as chatting with the elevator man on the way up in the morn- 
ing and encouraging subordinates to come into the office for friendly 
visits. Other qualities are integrity, ability to make decisions, willing- 
ness to change one’s mind and technical mastery. This last quality leads 
to the worthwhile suggestion that some executive leaders may be devel- 
oped from the ranks and thus have the requisite technical knowledge. 
Intelligence is included as a requisite for leadership on experimental as 
well as observational grounds. Teaching skill is desirable and may sup- 
plant some ordering or dominating. The author does not go into the 
conventional laws of learning but mentions a few practical points that 
apply to the industrial situation. These include building up a desire to 
learn, relating the work to present knowledge, using the whole organism 
rather than some isolated part and taking adequate time for training. 

Chapter 9 on Methods of Leading is another which will especially inter-
est the industrial psychologist. With reference to giving orders we have
concrete suggestions such as being clear, adapting oneself to various 
vocabularies, being explicit, using a good voice, showing no annoyance in 
the voice, being courteous, avoiding sharp imperatives or a bullying tone, 
not giving too many orders at once and minimizing negative commands 
because they are apt to be confusing and suggest the very thing we do 
not wish done. Unnecessary orders should be avoided. If it is feasible 
to make out a ‘‘job ticket’’ for the entire day’s work the employee’s self-
respect may be greater than that of the typical domestic employee who
is continually interfered with on petty details. When giving reproof one 
should have the facts, should conduct the rebuke in private, without any 
anger, but the other workers should know about it afterward in order to 
realize that insubordination is not tolerated. Commendation, on the other 
hand, may well be given in public. The author has obviously encountered 
the psychological literature on praise and blame. The leader should also 
consider his personal bearing and if possible get a check on it from some 
friend. He should avoid a condescending attitude, ‘‘rattling’’ the em- 
ployee or becoming a ‘‘gloom artist.’’ This chapter includes a lot of 
valuable suggestions that would apply to any supervisory situation 
whether or not questions of leadership were paramount. 

Certain leadership situations are discussed such as being a conference 
chairman or an assistant leader. Then we have a consideration of haz- 
ards of leadership, centering around the exaggeration of various aspects 
of personality. Among these are love of power due to some compensatory 
mechanism, emotional instability, and obsessive fears; for instance, that 
one is not adequate for the job. Rationalization may lead to paternalism. 
The author stresses sex frustrations as carrying over into the attitude 
toward employees, particularly those of the opposite sex, and goes so far 
as to call the attitudes of some leaders sadistic. 

A chapter is devoted to women leaders. The reviewer would rather 
have some woman evaluate this chapter. The author seems to believe that 
normal feminine interests are in the home and family and unless these 
can be combined with the outside leadership career, greater difficulties 
are encountered than in the case of the male leader. In other words, it
is more difficult for the woman to have both a home and business life than 
it is for the man. Most of the difficulties which are presented as illustra- 
tive material apply more to the spinster than to the married woman in a 
position of leadership. 

In connection with developing leaders the author very wisely suggests
selecting them carefully at the outset. He recommends intelligence tests 
and suggests that, after further research, personality tests may have a 
very important bearing on this question. The reviewer heartily concurs 
in both of these points. The author also recommends training in psychol-
ogy as part of the program, but laments the fact that in the typical text-
book ‘‘the whole person in action in typical situations seems to get lost
to view.’’ The reviewer seriously suggests that the author ought to col- 
laborate with a psychologist in a text which would not have these faults. 
Several proposals are given for measuring results of leadership training, 
such as the volume of work or the quality of the work done by the group 
under supervision, instability of its membership, the number of complaints 
or grievances, and the actual opinions of the employees. These methods 
strike the reviewer as affording some valuable suggestions to those who 
are anxious to develop criteria for leadership against which to evaluate 
tests, ratings, scales and the like. At the end the author stresses again 
that good leading depends upon good following and, with a glance at the 
broader social field, hints at the danger that there may arise blind leaders 
of the blind. 

The author’s general philosophy of leadership is apparent throughout. 
He is thinking of it from a broader standpoint than merely securing 
action on the part of the subordinate and is stressing cooperation. While 
the book does not embody a great deal of specific psychological discussion, 
very many facts of this character are implicit in the work, and will be of 
interest to the applied psychologist. The terminology for the most part 
is common sense rather than technical—for instance, certain emotional 
attitudes are attributed to the ‘‘heart,’’ although psychological terms are 
used from time to time. The reviewer detects only one case of actual mis- 
use in which the concept of sadism is applied to some types of industrial 
persecution which have apparently no sexual characteristics. The psy- 
chologist will probably be most interested in Chapter 4 on ‘‘ How Leaders 
Influence Others,’’ and Chapter 9 on ‘‘Methods and Manners of Lead- 
ing,’’ although the intervening chapters which deal with characteristics 
of leaders will also be of considerable interest. 

Tead construes leadership to include almost all aspects of supervisory 
work. To this extent, portions of the book constitute a good discussion
of foremanship. Well-chosen examples are given throughout the work, 
drawn especially from business and to a lesser extent from politics. 
There is a refreshing absence of lengthy descriptions of Abraham Lin- 
coln, Martin Luther and Woodrow Wilson. The book is distinctly not 
inspirational, but descriptive and objective—far better than the sympa- 
thetic but sentimental biographical approach, so often found. It might 
serve as a preliminary manual for one who wanted to be a leader and 
would also constitute good reading for business executives who wished to 
scrutinize themselves a little more closely. The industrial psychologist 
will find in it much interesting illustrative material and it may help him 
to see some of his more academic principles in a practical setting. 

HAROLD E. BURTT,
Ohio State University. 


Brain Fields and the Learning Process. J. A. GENGERELLI. Psychologi-
cal Monographs. XLV, No. 4. 1934. Pp. 115.

Believing that the time is ripe for the formulation of a possible neuro-
logical theory to explain the learning process, the author opens his mono-
graph by assuming the following necessary postulates for his theory of
learning:


‘‘1. The afferent excitation entering the cerebral mass diffuses
throughout certain regions of that mass and remains in the form
of an after-excitation for some time after the stimulus has been
removed.

2. The intensity of the after-excitation varies as some inverse func-
tion of the time which has elapsed since its inception.

3. For a given set of cortical conditions, the locus, intensity and
spread of excitation and after-excitation in the cerebral mass are
unequivocally related to the locus, the pattern and frequency of
incoming afferent impulses.

4. The degree of polarization present influences the conductivity of
the junction (synapse) between two juxtaposed neurones; and the
passage of a train of impulses over a given synapse may alter the
degree of polarization.

5. The synapse is a point of increased resistance, and the frequency
which it transmits is never greater than the frequency transmitted
to it by the centripetal fiber.’’


Further, assuming the crux of learning to be centered about the differ- 
ential conditions which make for the fixation or the extinction of a re- 
sponse, the author proceeds to apply his postulates to these two processes 
in the different phases of learning in their simplest forms. In the appli- 
cation of these postulates, he makes use of the membrane theory of inter- 
face potentials. Moreover, his theory involves the contention that nerve 
stimulation is primarily a matter of depolarization and from this conten- 
tion the conclusion is reached that ‘‘the smaller the amount of positive 
charges (within limits) on the outer surface of the membrane, the greater 
the speed of conduction; the greater the amount the less the conduc- 
tion. ...’’ The author is not especially interested in the mechanism of 
propagation but only in the fact that ‘‘increased positive charges at the
outer surface of the fiber membrane serve to increase the stimulation 
threshold and diminish the rate of conductivity. ...’’ With this princi- 
ple in mind he turns his attention to the synapse ‘‘as a region of increased 
‘resistance’ and diminished conductivity.’’ However, since the summa- 
tion of several impulses is usually necessary to depolarize the synaptic 
interface so that the wave of excitation may cross the synapse and effect 
a stimulation of the adjacent neurone, his problem then becomes essen- 
tially that of explaining the factors which change the degree of polariza- 
tion (under initial and repeated stimulation) at the synapse. ‘‘Obvi- 
ously,’’ says the author, ‘‘the amount of positive charge at the synapse 
may vary in two directions. It may either increase or diminish. If it 
increases, we have as a consequence of our preceding considerations a 
diminution in irritability or conductivity of the synapse; if it diminishes,
we have an increase in irritability and conductivity. Clearly, the one 
process moves toward extinction; the other, toward fixation.’’ One other 
point has to be considered, however, and that is, ‘‘at the synapse with 
each impulse there are liberated toward the outer surface of the membrane 
both negative and positive ions, and the net polarization effect there will 
be the algebraic resultant of these two qualities.’’ From preceding 
points it will, at once, be seen that the crux of the author’s theory of the 
brain fields underlying the learning process will be centered around the 
synaptic connections involved during that process. 

His theory, then, will stand or fall on, first, whether extinction and
fixation are the two most fundamental processes in learning; secondly,
whether the synapse is as important in the two processes as he conceives;
and thirdly, if the synapse does play the rôle he thinks, whether he has
analyzed rightly the factors he has brought into relief and, if so, whether
all factors have been included or there are others yet to be revealed. The
author is aware of the fact that some factors have been neglected for
simplicity’s sake. He then assigns values to the following factors, which
he assumes to influence the polarization processes, and from these develops
twelve formulae to explain learning. Using the author’s own wording, the
factors are: the amount of positive charge present at the synaptic
interface, the amount of positive charge released by a single impulse, the
amount of negative charge released by a single impulse, impulse frequency
with which x neurone bombards synapse, duration of impulse volley coming
over x neurone, and limit of amount of negative charge which membrane will
capture and hold. (Blair’s law of decay is also used.) Some of these
factors are deduced from experimental findings; some are yet to be
experimentally determined. Remembering the author’s postulates and with
the above listed factors we are then ready to follow his application of
them to different phases of learning. The phenomena of diffused cortical
excitation, after-excitation, adaptation, over-crowding at synapse,
anticipation, polarization, depolarization, and repolarization are invoked
in the explanation of ‘forward conditioning,’ ‘experimental extinction,’
‘trace responses,’ ‘conditioned inhibition,’ ‘law of effect,’ ‘order of
elimination of errors,’ ‘latent learning,’ ‘retroactive inhibition,’
‘spaced and unspaced learning,’ etc.

Space in this review will not allow a detailed following of the author 
in his analysis of each phase of learning to which he has applied his postu- 
lates. It is an ingenious theory of what happens in the nervous system 
while learning is taking place. While, as the author points out, it is full 
of gaps, nevertheless the theory is full of possibilities. The work is 
scholarly, and the author has proved that he can logically develop a frame- 
work of the dynamics of brain action during the learning process. 
Whether he has interpreted correctly all present knowledge of nerve physi- 
ology and whether future facts will prove that his theory is right cannot 
as yet be said. The author is to be commended, however, for his courage
in pioneering in this field. Certainly his monograph should stimulate 
further inquiry into this important field. 
JAMES R. PATRICK,
Ohio University. 


CHANDLER, ALBERT R. Beauty and Human Nature. New York: D.
Appleton-Century, 1934. Pp. 381. 

This book is intended as a text in psychological esthetics and fulfills 
a need for objectivity in this field. Its main contribution is the collection 
of experimental results from various sources on visual forms, color, music, 
literature, and rhythms of speech. It includes also an analysis of their 
elements and the elements of architecture, sculpture, pictorial art, and 
language. Music, including elements, structure, expressiveness, and tests 
of ability, receives more extended treatment than any other topic. A dis- 
cussion and description of artistic qualities and abilities, including tests 
of musical, art, and literary abilities receive secondary emphasis. The 
pleasantness and expressiveness of color rank third, while the pleasant- 
ness and expressiveness of visual forms are given fourth place—as to num- 
ber of pages. Importance is determined largely by the amount of experi- 
mental material available. Less than 40 per cent of the book is taken up 
with experimental material while the balance includes a description and 
analysis of various art forms, elements, and structure; varieties of 
esthetic experience; methods of studying esthetic experience, its impor- 
tance, general nature, and relation to other satisfactions; artistic qualities 
and abilities; and culture and appreciation. 

No new theories or experiments are included. Neither is an attempt 
made to form any laws of esthetic experience in general or for specific 
arts. Such esthetic concepts as balance and variety, repetition with varia- 
tion, combination and contrast are mentioned only hastily or not at all. 
Less than two pages (pp. 12-14) are given to an analysis of the esthetic 
experience and six pages (pp. 23-29) to esthetic categories. From this 
it will be seen that the author has intended the book to be only a struc- 
tural approach to the facts in the field; this he has accomplished with 
considerable thoroughness. Each chapter has a good bibliography and 
notes on certain references.

RALEIGH M. DRAKE,
Wesleyan College, Macon, Georgia. 


SKAGGS, E. B. A Textbook of Experimental and Theoretical Psychology.
Christopher Publishing House, Boston, Mass., 1935, $4.00. 

The author specifically points out three different schools of psychology, 
namely, what he calls ‘‘mentalists,’’ ‘‘behaviorists’’ and a third school
which is a combination of these two. He professes to write this book 
from the point of view of this third school. 

The treatment of several topics is too brief and vague to make this an
adequate textbook of experimental and theoretical psychology. For ex- 
ample, the discussion concerning color-blindness and its measurement is 
one which omits the mention of a great deal of material which is relevant, 
and the treatment of Weber’s law ignores completely the highly signifi- 
cant experimental and theoretical implications of this law. 

There are omissions of several important bodies of material on various 
phases of general psychology, e.g., no mention is made of statistical
measures, even the simplest measures of central tendency and dispersion. 

The manuscript for this book has been carelessly proofread, for con-
sistently in the general references at the end of chapters Whittenberg is
printed rather than Wittenberg, and Janet’s book is given the title of
‘‘The Major Forms of Hysteria,’’ rather than ‘‘Symptoms of.’’ He
attributes to Bayliss the authorship of a book on ‘‘Principles of General
Psychology.’’ At the end of the second chapter the title of Fulton’s
‘‘Muscular Contraction and the Reflex Control of Movement’’ is correct,
but at the end of the fifteenth chapter it appears as ‘‘Muscular Contrac-
tion and the Control of Reflex Movement.’’ An article by M. B. Mitchell
is referred to as if it were written by a man, while the name of Mildred
B. Mitchell appears in bold type at the beginning of the article.

Within the text there is so much repetition that it serves as distraction 
rather than an aid to clarity. This book does not give the impression of 
being an important contribution to the long list of recent textbooks in the 
field. 

K. W. OBERLIN, 
University of Delaware. 


Kurt Lewin. A Dynamic Theory of Personality. Trans. by D. K. 
Adams and K. E. Zener. New York: McGraw-Hill Book Co., 1935. 
x+286 p. $3.00. 

Since the publication of the first and third chapters of this book in the 
Journal of General Psychology and the Handbook of Child Psychology, 
respectively, American psychologists have been interested in the work of 
Kurt Lewin. He offered a different dynamic and genetic approach to the 
problems of behavior. His position that statistical study of heterogene- 
ous groups did not and could not establish ‘‘laws’’ of behavior suggested 
possibilities of new ways of attacking experimental problems. These two 
articles, however stimulating, did not give a very complete picture of the 
possibilities of his systematic position. The present volume presents the 
first extensive account of Lewin’s work in English. 

The selected papers are those most pertinent to an explanation of his 
theories. The metaphysical position is set forth in the already well known 
paper contrasting Aristotelian and Galileian modes of thought. Lewin 
favors the latter which he says, in relation to dynamics, ‘‘derives all its
vectors not from single isolated objects, but * * * essentially, from the
momentary condition of the individual and the structure of the psycho-
logical situation.’’ From this metaphysical position he derives his
methodological approach which consists in the observation of behavior 
for the purpose of discovering laws having the same sort of universal 
validity that one finds, for example, in the law of falling bodies. This 
physical law is probably only approximated even in the most carefully 
controlled experiments; under all ordinary conditions deviations occur 
but they may be accounted for by the circumstances of the specific occa- 
sion. 

Likewise psychology is to establish ‘‘laws’’ of behavior in terms of 
environmental forces acting on the psychological person. Persons are 
structured totalities of systems, which vary in degree and kind of differ- 
entiation, in fluidity or rigidity of structuring, and in their content. The 
environment offers the means of satisfying the needs of the systems and 
the totality, but it also offers barriers to such satisfaction. The
topological dynamics of these forces constitutes the essential field of
search for psychology.

The selected papers in this book show the operation of this position in 
theoretical discussion and experimental attack. The list of essay topics 
indicates the range: structure of mind, child behavior, reward and
punishment, education, substitute activity, and a dynamic theory of the
feeble-minded. As these represent papers written at different times for
different occasions, the book hardly presents a unitary, coherent
development of a theoretical system. It does, however, show the systematic
interpretation of a number of important problems. In the last chapter an
attempt has been made to present in logical form an abstract of much of
the experimental work of Lewin and his students. 

That Dr. Lewin’s theories and this particular translation of their 
presentation are of great importance to current psychological thought 
cannot be doubted. Whether this system will solve all of the
psychological riddles can be answered only by the future.

C. M. Louttit,
Indiana University. 


THOMSON, WILLIAM A. Making Millions Read and Buy. Walter Drey:
New York, 1934. 248 pp. 

This book will not be of special interest to the average psychologist, 
as it is written more for the student of newspaper advertising. It deals 
with the problems, methods, and advantages of newspaper advertising. 
Mr. Thomson is thoroughly convinced that the newspaper has all the mer- 
its of other media in addition to some of its own. One cannot help shar- 
ing his enthusiasm. 

Several chapters deal with the make-up of the newspaper advertise- 
ment: copy, illustrations, use of news, frequency, market analysis. Also 
he discusses techniques of selling this form of advertising to manufactur- 
ers and retailers. 

The book is written in typical journalistic style: racy, easy to read, and
interesting. At first glance it seems chiefly to raise problems and give 
only the haziest answers. This alone would be valuable, as after all ad- 
vertising cannot be cut and dried; each advertiser has his own problems, 
and the factor of novelty is very important. But also one finds on closer 
inspection that there are a great many facts, figures, and illustrations 
which give concrete evidence on the points the author brings up. 

R. W. Husband,
University of Wisconsin. 


J. C. FLANAGAN. Factor Analysis in the Study of Personality. Stanford
University Press. 1935. Pp. 103. 

Even if it is true that¹ ‘‘There is one department in psychology
[personality] in which no progress seems to have been made for about two
thousand years . . .,’’ this book may well mark the beginning of a period
of rapid progress if psychologists in the department of personality adopt 
and utilize the methods discussed and illustrated. Psychology would be 
placed soundly upon an objective scientific basis if statistical methods as 
advanced as factor analysis were used widely and correctly. 


‘‘This monograph [which is a Harvard doctoral thesis] aims to give a
critical review of factor theory and its development, including the very
recent contributions of Thurstone and Hotelling; to develop a general
technique for constructing tests to measure independent components
directly; to provide a new method for determining the values to be attached
to individual responses, namely, an iterative method of solving for
regression coefficients without computing intercorrelations; and to present
two measures of independent components representing almost all of the
information contained in an original set for four variables.’’ (p. v.)


In Chapter 1 the author discusses the early attempts to describe and 
classify individuals such as the attempt of alchemy to give a chemical 
base to individual differences, the search for a simple solution which re- 
sulted in numerology, astrology, graphology, palmistry, and physiognomy. 
Flanagan wisely points out the weaknesses of these attempts, their arbi- 
trary, subjective basis, their dependence upon memory of past experi- 
ence rather than recorded data, and upon a small number of observations 
which were not independent of each other. 

In evaluating later theories, the author states that ‘‘The ideal theory 
of personality would: 


1. Define its elements without ambiguity and in terms of behavior.

2. Be founded on extensive and accurate observations.

3. Consist of basic elements which are independent.

4. Provide a simple explanation of the maximum number of well-established
facts.

5. Have the maximum predictive value.

By applying only the first two criteria, those most obviously essential to
a sound theory, we find existing theories to be woefully weak.’’ (pp. 3-4.)

1 A. A. Roback, The Psychology of Character.

Flanagan considers the tests which attempt to measure intelligence, 
character, temperament, perseverance, persistence, caution, initiative, 
‘‘will-temperament’’ traits, honesty, service, co-operation, inhibition, and
emotional stability, the ‘‘controlled observation test,’’ the individual
questionnaire or inventory and concludes that ‘‘the usual ratings have little
direct value for the purpose of ascertaining the basic elements of
personality.’’ (p. 6.)

To obtain independent elements, factor analysis techniques are necessary
according to the author, who evidently expects these techniques to settle
the question as to whether ‘‘personality is made up of a few broad factors
or of many fairly specific ones,’’ about which ‘‘there is certainly no
agreement . . . at the present time.’’

Chapter 2 contains an excellent ‘‘brief survey of available factor analysis
techniques.’’ The reviewer knows of no other source where the student
of factor analysis can obtain at present such a critical comparison of
Hotelling’s and Thurstone’s techniques and an evaluation of the
contributions of the men who preceded them.

Chapter 3, ‘‘The Analysis of Three Sets of Data by Hotelling’s 
Method,’’ is presumably included for illustrative purposes—to show the 
reader the results of factor analysis. It might have been better to analyze 
one set of data by both Thurstone’s method and Hotelling’s method rather 
than use the latter technique exclusively. The author believes Hotelling’s 
procedure to be the best and the reviewer concurs, but there are probably 
many psychologists who would not agree with Flanagan. Factor analysis 
has not been perfected yet and it is possible that further improvements will 
embody more of Thurstone’s technique. Hotelling determines from the 
roots percentages that add to 100%, and concludes² that the first factor
accounts for 46½% of the total variance, the second factor, 36½%, the
third, 13%, and the fourth, 4%. This procedure allows no percentage or
share for the unexplained part of the total variance. Really, the first
factor accounts for 46½% of that part of the total variance which can be
explained by factor analysis. Thurstone uses Σa²/n (which is similar to
Hotelling’s root divided by the number of tests) to obtain a value which
measures the importance of the factors. Thus it is not necessary to obtain
all the factors before one can be evaluated nor is it necessary to claim
that the factors explain all of the total variance (thus leaving a share to
chance unexplained or not accounted for). Hotelling’s procedure is
similar to the step adopted by Snedecor³ in multiple correlation when he
adds the betas (standard regression coefficients) and adjusts them so
they will total 100, thus leaving no share or percentage to other variables
not considered and chance fluctuations. Flanagan does not follow
Hotelling so far as to determine as many factors as there are tests being
analyzed, but after determining the loadings for three factors in the first
illustration (involving eleven course grades) he adds a fourth column
headed IV–XI which contains the ‘‘. . . standard deviations of the
portions of the original variables accounted for by the remaining factors
. . .’’ Whether these values include the effect of random fluctuations is
not clear, but if they do many of them (over half) are in error in the
third (last) decimal place, as may be shown by the fact that the sum of the
squares of the factor loadings opposite any test (including the value in
column IV–XI) should equal one or the reliability of the tests (whichever
is used in the diagonal cell). The practice of determining percentages
which measure the ‘‘contribution’’ of each factor needs further study as
does the practice of determining coefficients of determination in multiple
correlation.

2 H. Hotelling, ‘‘Analysis of a Complex of Statistical Variables into
Principal Components,’’ Jour. Ed. Psychol., XXIV (1933), 417-41, 498-520.

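
The two ways of scoring a factor's importance, and the row check on the
pooled column, may be illustrated by a minimal sketch. It is not taken from
Flanagan or Hotelling: the function names are hypothetical, the four roots
are merely those implied by the percentages quoted above on the assumption
of unities in the diagonal (so that the roots total 4), and the row of three
loadings is invented for the purpose.

```python
import math


def share_of_extracted(roots):
    """Percentages scaled to total 100 over only the roots supplied."""
    total = sum(roots)
    return [100.0 * r / total for r in roots]


def share_of_all_tests(roots, n_tests):
    """Each root divided by the number of tests, as a percentage.

    No knowledge of the unextracted roots is needed.
    """
    return [100.0 * r / n_tests for r in roots]


def residual_loading(loadings, diagonal=1.0):
    """Entry for a pooled 'remaining factors' column.

    Chosen so that the squared loadings in the row sum to the diagonal
    value (1, or the test's reliability if that was used in the diagonal).
    """
    explained = sum(a * a for a in loadings)
    if explained > diagonal:
        raise ValueError("squared loadings exceed the diagonal value")
    return math.sqrt(diagonal - explained)


# Roots implied by the quoted percentages for four tests (assumption:
# unit diagonal, so root = percentage * 4 / 100; not copied from Hotelling).
roots = [1.86, 1.46, 0.52, 0.16]

# With only two factors extracted the two scorings disagree.
print([round(p, 1) for p in share_of_all_tests(roots[:2], 4)])
print([round(p, 1) for p in share_of_extracted(roots[:2])])

# Hypothetical row of three loadings; the value printed is what a pooled
# fourth column would have to contain, to three decimals.
print(round(residual_loading([0.6, 0.5, 0.3]), 3))
```

Dividing by the number of tests lets each factor be evaluated as soon as it
is extracted, whereas scaling to 100 over the extracted roots changes every
percentage whenever another factor is added.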

Chapter 4 traces the development of techniques ‘‘for obtaining
uncorrelated test scores.’’ The author’s summary cannot be improved:


The pioneer work in devising a technique for weighting items has been 
done by Kelley. Starting from this basis and thinking in terms of N- 
dimensional space, where N is the number of individuals in the population, 
as Wilson has suggested, an iteration method for the solution of regres- 
sion equations has been proposed by the writer. Although mathematical 
proof of the convergence of this method has not been supplied, the method 
has been shown to give good results with a tremendous saving in labor for 
the type of problem encountered in test construction. The problem may 
be stated, in cases involving K tests and N individuals, as one of finding 
the vector in a space of K dimensions which most closely approximates 
the criterion vector, which is in a space of N dimensions. It would ap- 
pear that this problem would be solved when the approximations had been 
continued to the point at which none of the vectors representing the 
original tests in the K dimensions were closer to the criterion vector in N 
dimensions than to the vector representing the combination of tests in K 
dimensions. Since the cosine of the angle between vectors is the correlation 
coefficient, we may readily determine whether or not the solution has been 
obtained. In any event, as has been previously mentioned, the success of 
each approximation may be measured by means of the increase in the 
multiple-correlation coefficient. (p. 79.) 
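
The statement that the cosine of the angle between vectors is the
correlation coefficient can be checked numerically. The sketch below is an
illustration only, assuming that each vector holds the N individuals' scores
expressed as deviations from their mean (the passage does not say so
explicitly); the function names and the five pairs of scores are
hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation computed from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    return numerator / denominator


def cosine_of_deviations(x, y):
    """Cosine of the angle between the two deviation-score vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    dot = sum(a * b for a, b in zip(dx, dy))
    length_x = math.sqrt(sum(a * a for a in dx))
    length_y = math.sqrt(sum(b * b for b in dy))
    return dot / (length_x * length_y)


# Hypothetical scores of five individuals on one test and on the criterion.
test_scores = [52, 61, 47, 70, 66]
criterion = [3, 4, 2, 5, 4]

# The two quantities agree up to rounding error.
print(pearson_r(test_scores, criterion))
print(cosine_of_deviations(test_scores, criterion))
```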


The need for this iteration method is pointed out by the author: 

To obtain the weights for predicting one variable from a number of 
others we must, in general, calculate all the intercorrelations and solve 
the resulting simultaneous equations. Short-cuts have been developed
which facilitate the solution by using product-moments in place of the
correlation coefficients or by providing a more rapid solution of the
simultaneous equations by means of an iteration process, etc. However, if
we were to attempt to solve for proper weights to be attached to the 250
independent responses in order to obtain the best prediction of the
criterion variable, we should find it necessary to compute 31,375
correlation coefficients or product-moments, not to mention the solution of
250 simultaneous equations. We must seek a short-cut which, while giving us
estimates close to the ‘‘precise’’ values, is not too laborious. (pp. 50-51.)

3 G. W. Snedecor (and H. A. Wallace), ‘‘Correlation and Machine
Computation,’’ p. 48. The relationship between factor analysis and multiple
correlation should be studied more carefully. In an article already
accepted for publication in this journal the reviewer has applied the two
techniques to the same data and compared the results.
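
The figure of 31,375 agrees with the number of distinct pairs among 251
variables, that is, the 250 responses together with the criterion. That
reading is an inference from the arithmetic, not something the passage
states, and the helper name below is hypothetical.

```python
from math import comb


def pairwise_count(n_variables):
    """Number of distinct pairs, hence of correlations, among n variables."""
    return comb(n_variables, 2)


responses = 250
print(pairwise_count(responses + 1))  # responses plus the criterion
print(pairwise_count(responses))      # intercorrelations of responses only
```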


Estimates of the regression equation coefficients are obtained by noting 
that since they are ‘‘. . . determined by items at the extremes to a much 
greater extent than by items near the middle of the distribution, estimates 
of their values may be obtained with great decrease in labor and a much 
smaller decrease in efficiency by using only the tails of the distribution.’’ 
Table XI gives ‘‘values of the product-moment correlation coefficient in a
normal bivariate population corresponding to values of a and c in a
fourfold distribution composed of items in the tails of the dependent
variable beyond plus- and minus-one standard deviation.’’

A minor criticism of the illustrations on page 56 arises out of the fact 
that Flanagan uses p’ and q’ to represent high and low values respectively 
on the y axis and reverses these letters so q and p represent high and low 
values on the x axis. 

There are four appendices: ‘‘The Method of Eliminating the Spurious
Correlation Introduced by Intercorrelations between Errors’’; ‘‘The
Derivation of the Formula for Determining the Variance of the New
Uncorrelated Scores Obtained by Hotelling’s Method of Principal
Components’’; ‘‘The Method of Obtaining the First Estimates for the
Response Scores’’; and ‘‘A Detailed Outline of the Steps involved in
Constructing Scoring Keys for the New Independent Variables Determined by
Hotelling’s Method of Principal Components.’’

Factor analysis is a statistical technique which will grow in importance 
rapidly. It may be used advantageously in many fields into which it has 
not entered at present. For example, in political science, factor analysis 
may be used to study party uniformity in voting, the regional factor in 
voting, the religious influence in elections, etc. In economics, factor 
analysis may be used to study and isolate the common factors in the 
movements of commodity and stock prices and such analyses will change 
index number technique greatly. The importance of factor analysis in 
psychology will increase as research workers find that it enables them to 
discover and measure what is common in the responses of individuals to 
different situations. 

Harry Pelle Hartkemeier,
University of Missouri. 